You should really listen to the talk "Lessons Learned from Migrating 2+ Billion Documents at Craigslist" by Jeremy Zawodny.
However, if you don't have 30 minutes to spare, these are the main items:
1. Pay attention to encoding. MongoDB uses UTF-8, so you'll need to convert your data first if it comes in a mix of encodings.
2. There's a document size limit (it differs from version to version), so if some of your documents are too big you should plan how to handle them. Otherwise the load will fail when you try to insert them into MongoDB.
3. Pay attention to data types (don't store everything as a string), otherwise you'll have trouble when querying. This is especially tricky when using dynamically typed programming languages. Also make sure that the driver you use is not guessing your data types for you.
4. Sharding: when you first load the data you can stop the internal load balancer (to reduce IO), and you can also pre-split the data in advance.
5. Consider using a file system that supports compression if you store lots of text.
And finally, join the mailing list. It has tons of information that would be very helpful.
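
To make items 1 to 3 a bit more concrete, here is a rough sketch of a cleanup step you could run on each record before loading it. It is not from the talk. The field names (`title`, `price`, `posted`), the `latin-1` fallback encoding and the 16 MB default limit are all assumptions for illustration, and the size check uses JSON as a rough stand-in for the real BSON size, so check the numbers against your own data and MongoDB version.

```python
import json
from datetime import datetime

# Assumption: 16 MB default cap; the actual limit depends on your MongoDB version.
MAX_DOC_BYTES = 16 * 1024 * 1024


def to_utf8_text(raw: bytes, fallback: str = "latin-1") -> str:
    """Decode bytes as UTF-8, falling back to an assumed legacy encoding."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode(fallback)


def coerce_types(record: dict) -> dict:
    """Convert string fields to real types (hypothetical schema)."""
    return {
        "title": record["title"],
        "price": int(record["price"]),
        "posted": datetime.strptime(record["posted"], "%Y-%m-%d"),
    }


def approx_size(doc: dict) -> int:
    """Rough size estimate via JSON; the stored BSON size will differ."""
    return len(json.dumps(doc, default=str).encode("utf-8"))


def prepare(raw_fields: dict, max_bytes: int = MAX_DOC_BYTES) -> dict:
    """Decode, type and size-check one record before it is loaded."""
    decoded = {key: to_utf8_text(value) for key, value in raw_fields.items()}
    doc = coerce_types(decoded)
    if approx_size(doc) > max_bytes:
        raise ValueError("document too large; split or trim it before loading")
    return doc
```

For example, `prepare({"title": b"caf\xe9", "price": b"42", "posted": b"2011-05-01"})` gives back a dict with a properly decoded title, an integer price and a `datetime` value, which you can then hand to your driver. Doing the conversion explicitly like this means you are not relying on the driver to pick the types for you.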