Loading all rows in Cassandra in parallel using multiple (Python) clients
Published: 2020-12-20 13:34:09 | Category: Python | Source: compiled from the web
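When using Cassandra's recommended RandomPartitioner (or Murmur3Partitioner), it is not possible to do meaningful range queries on keys, because the rows are distributed around the cluster using a hash of the key (the MD5 hash in the case of RandomPartitioner). These hashes are called "tokens".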
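As a rough illustration of what a token is, the sketch below assumes that RandomPartitioner takes the MD5 digest of the serialized key, reads it as a signed big-endian integer, and uses its absolute value. The function name `random_partitioner_token` is hypothetical, and the 4-byte big-endian key encoding is an assumption about how an `int` key is serialized, so treat the printed values as illustrative rather than authoritative.

```python
import hashlib
import struct


def random_partitioner_token(key_bytes):
    """Approximate a RandomPartitioner token for an already-serialized key.

    Assumption: token = abs(MD5 digest read as a signed big-endian integer).
    """
    digest = hashlib.md5(key_bytes).digest()
    return abs(int.from_bytes(digest, byteorder="big", signed=True))


# Assumption: an int key is serialized as 4 bytes, big-endian.
for k in range(4):
    key_bytes = struct.pack(">i", k)
    print(k, random_partitioner_token(key_bytes))
```

Consecutive keys end up at unrelated positions in the token space, which is why a range query on the key itself is not meaningful under these partitioners.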
Nonetheless, it would be very useful to split up a large table amongst many compute workers by assigning each worker a range of tokens. Using CQL3, it appears possible to issue queries directly against the tokens; however, the following python does not work...

EDIT: it works after switching to testing against the latest version of the Cassandra database (doh!), and after updating the syntax per the notes below:

    ## use python cql module
    import cql

    ## If running against an old version of Cassandra, this raises:
    ## TApplicationException: Invalid method name: 'set_cql_version'
    conn = cql.connect('localhost', cql_version='3.0.2')
    cursor = conn.cursor()

    try:
        ## remove the previous attempt to make this work
        cursor.execute('DROP KEYSPACE test;')
    except Exception as exc:
        print(exc)

    ## make a keyspace and a simple table
    cursor.execute("CREATE KEYSPACE test WITH strategy_class = 'SimpleStrategy' AND strategy_options:replication_factor = 1;")
    cursor.execute("USE test;")
    cursor.execute('CREATE TABLE data (k int PRIMARY KEY, v varchar);')

    ## put some data in the table -- must use single quotes around literals, not double quotes
    cursor.execute("INSERT INTO data (k, v) VALUES (0, 'a');")
    cursor.execute("INSERT INTO data (k, v) VALUES (1, 'b');")
    cursor.execute("INSERT INTO data (k, v) VALUES (2, 'c');")
    cursor.execute("INSERT INTO data (k, v) VALUES (3, 'd');")

    ## split up the full range of tokens.
    ## Suppose there are 2**k workers:
    k = 3  # --> eight workers
    token_sub_range = 2**(127 - k)
    worker_num = 2  # for example
    start_token = worker_num * token_sub_range
    end_token = (1 + worker_num) * token_sub_range

    ## put single quotes around the token strings
    cql3_command = "SELECT k, v FROM data WHERE token(k) >= '%d' AND token(k) < '%d';" % (start_token, end_token)
    print(cql3_command)

    ## against an old version of Cassandra, this failed with
    ## "ProgrammingError: Bad Request: line 1:28 no viable alternative at input 'token'"
    cursor.execute(cql3_command)

    for row in cursor:
        print(row)

    cursor.close()
    conn.close()

Ideally, I would like to be able to use pycassa, because I prefer its more pythonic interface.

Is there a better way to do this?

Solution:
I have updated the question to include the answer.
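The `worker_num` arithmetic in the question generalizes to any number of workers, not only powers of two. Here is a minimal sketch, assuming RandomPartitioner's token space of 0 to 2**127; the helper names `token_ranges` and `range_query` are hypothetical and not part of cql or pycassa. The query text mirrors the format used in the question and may need adjusting for the CQL version in use.

```python
# Assumption: RandomPartitioner, whose tokens lie between 0 and 2**127.
MAX_TOKEN = 2 ** 127


def token_ranges(num_workers, max_token=MAX_TOKEN):
    """Split [0, max_token) into num_workers contiguous half-open ranges."""
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    # Integer arithmetic keeps the boundaries exact for these large values.
    bounds = [i * max_token // num_workers for i in range(num_workers + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def range_query(start_token, end_token):
    """Build the SELECT for one worker, in the same format as the question."""
    return ("SELECT k,v FROM data WHERE token(k) >= '%d' AND token(k) < '%d';"
            % (start_token, end_token))


if __name__ == "__main__":
    # Print the query each of eight workers would run.
    for worker_num, (start, end) in enumerate(token_ranges(8)):
        print(worker_num, range_query(start, end))
```

Each worker would then pass its own query string to `cursor.execute`, as in the question's code. With eight workers, the range for worker 2 matches the `start_token` and `end_token` computed there.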
(Editor: 李大同)