-
Cluster: A set of Solr nodes managed as a single unit. The entire cluster must share a single schema and solrconfig.
-
Node: A JVM instance running Solr.
-
Partition: A subset of the entire document collection, created in such a way that all its documents can be contained in a single index.
-
Shard: A partition must be stored on multiple nodes, as specified by the replication factor. These nodes collectively form a shard. A node may be part of multiple shards.
-
Leader: Each shard has one node identified as its leader. All writes for documents belonging to a partition should be routed through the leader.
-
Replication Factor: The minimum number of copies of a document maintained by the cluster.
-
Transaction Log: An append-only log of write operations maintained by each node.
-
Partition version: A counter maintained by the leader of each shard. It is incremented on each write operation and sent to the peers.
-
Cluster Lock: A global lock that must be acquired in order to change the range -> partition or the partition -> node mappings.
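To make the relationships between these terms concrete, here is a minimal in-memory sketch in Python. It is not Solr code. The class names `Shard` and `Cluster`, the CRC32 hash, the hash space of 100, and the rule that the first node in a shard acts as leader are all assumptions chosen for illustration; the glossary itself only says that ranges map to partitions and partitions map to nodes.

```python
import threading
import zlib
from dataclasses import dataclass, field


@dataclass
class Shard:
    """The nodes that together store one partition (hypothetical model)."""
    partition: str
    nodes: list                                  # first entry is treated as the leader
    version: int = 0                             # partition version, owned by the leader
    tlogs: dict = field(default_factory=dict)    # node -> append-only log (per shard here)

    @property
    def leader(self):
        return self.nodes[0]

    def write(self, doc_id, doc):
        """Accept a write at the leader, bump the version, send it to every peer."""
        self.version += 1
        entry = (self.version, doc_id, doc)
        for node in self.nodes:                  # leader first, then its peers
            self.tlogs.setdefault(node, []).append(entry)
        return self.version


class Cluster:
    HASH_SPACE = 100                             # assumed size of the hash range

    def __init__(self, ranges, shards, replication_factor):
        for shard in shards.values():
            if len(shard.nodes) < replication_factor:
                raise ValueError(f"partition {shard.partition} is under-replicated")
        self.ranges = ranges                     # list of (low, high, partition)
        self.shards = shards                     # partition -> Shard
        self.lock = threading.Lock()             # stands in for the cluster lock

    def shard_for(self, doc_id):
        """Resolve range -> partition -> shard for a document id."""
        h = zlib.crc32(doc_id.encode("utf-8")) % self.HASH_SPACE
        for low, high, partition in self.ranges:
            if low <= h <= high:
                return self.shards[partition]
        raise KeyError(f"no partition covers hash {h}")

    def index(self, doc_id, doc):
        return self.shard_for(doc_id).write(doc_id, doc)

    def reassign(self, partition, nodes):
        """Change the partition -> node mapping while holding the cluster lock."""
        with self.lock:
            self.shards[partition].nodes = list(nodes)


cluster = Cluster(
    ranges=[(0, 49, "p1"), (50, 99, "p2")],
    shards={
        "p1": Shard("p1", ["node1", "node2"]),
        "p2": Shard("p2", ["node2", "node3"]),   # node2 belongs to two shards
    },
    replication_factor=2,
)
shard = cluster.shard_for("doc-1")
first = cluster.index("doc-1", {"title": "draft"})
second = cluster.index("doc-1", {"title": "final"})
```

After the two writes, every node in the owning shard holds both log entries and the shard's partition version has advanced twice. A real cluster would replicate over the network and coordinate the lock across nodes; a local `threading.Lock` only marks where that coordination would happen.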
Apache Solr 4.0-alpha was released on 2012-07-03, bringing a large number of exciting features. A brief list follows:
Notable features of the Solr 4.0-alpha release:
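- Distributed indexing, built to support near real-time (NRT) search and NoSQL features: realtime-get, optimistic locking, and durable updates.
- High availability, with no single point of failure.
- Relies on ZooKeeper for distributed coordination and for storing cluster metadata and configuration, so there is no need to deal with a distributed consensus protocol such as Paxos directly.
- Updates are automatically forwarded to all nodes of the appropriate shard.
- Queries automatically perform distributed search, load balancing, and failover.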
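- NoSQL features:
  - Durable updates, backed by a transaction log.
  - Real-time Get retrieves the latest data without committing the index.
  - Versioning and optimistic locking, combined with real-time get, ensure that read/update/write operations do not conflict.
  - Atomic updates: add, remove, change, and increment fields without supplying the complete document.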
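- Other features coming in Solr 4.0:
  - Pivot Faceting: multi-level facets.
  - Pseudo-fields: rename fields in results, including returning function values.
  - Spell checking can read directly from the main index.
  - Join: query documents by their relationship to documents in another schema.
  - Enhanced function queries, adding conditional functions and text relevancy functions.
  - New update processors that can modify a document before it is indexed.
  - A new admin web interface with SolrCloud support.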
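
The read/update/write cycle described under the NoSQL features can be illustrated with a toy in-memory store in Python. This is a sketch of the general technique, not Solr's API: `ToyStore`, `VersionConflict`, and the `set` / `inc` / `remove` operation names are hypothetical, and the transaction log here is just a Python list.

```python
import itertools


class VersionConflict(Exception):
    """Raised when the caller's version no longer matches the stored one."""


class ToyStore:
    def __init__(self):
        self._log = []                    # transaction log: every update is appended
        self._docs = {}                   # doc_id -> (version, fields)
        self._clock = itertools.count(1)  # source of increasing version numbers

    def get(self, doc_id):
        """Real-time get: return the latest version, no commit step required."""
        version, fields = self._docs[doc_id]
        return version, dict(fields)

    def update(self, doc_id, changes, expected_version=None):
        """Atomic update: apply only the listed field operations."""
        current_version, fields = self._docs.get(doc_id, (0, {}))
        if expected_version is not None and expected_version != current_version:
            raise VersionConflict(
                f"{doc_id}: expected {expected_version}, found {current_version}"
            )
        merged = dict(fields)
        for name, (op, value) in changes.items():
            if op == "set":
                merged[name] = value
            elif op == "inc":
                merged[name] = merged.get(name, 0) + value
            elif op == "remove":
                merged.pop(name, None)
            else:
                raise ValueError(f"unknown operation: {op}")
        new_version = next(self._clock)
        self._log.append((new_version, doc_id, changes))   # log first, then publish
        self._docs[doc_id] = (new_version, merged)
        return new_version


store = ToyStore()
store.update("book-1", {"title": ("set", "Solr"), "copies": ("set", 3)})

# Read, then update conditionally on the version that was read.
version, doc = store.get("book-1")
store.update("book-1", {"copies": ("inc", 1)}, expected_version=version)

# A second writer still holding the old version is rejected.
conflict = False
try:
    store.update("book-1", {"copies": ("inc", 1)}, expected_version=version)
except VersionConflict:
    conflict = True
```

The stale writer has to re-read the document and retry, which is the point of optimistic locking: no lock is held between the read and the write, and conflicts are detected only at update time.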