Indexing performance
Typically ɚ s/doc for filesystem, 2-10 s/doc for web
- performance highly dependent on gateway
- varies less with filters, but PDF is generally slow (don’t tell Adobe I said so)
Collections typically 20%-60% of dataset size
Improve insertion latency by committing smaller partitions