Sunday 15 August 2010

cassandra - Query results not ordered despite WITH CLUSTERING ORDER BY -



cassandra - Query results not ordered despite WITH CLUSTERING ORDER BY -

i storing posts users in table. want retrieve post users user following.

create table posts ( userid int, time timestamp, id uuid, content text, primary key (userid, time) )with clustering order (time desc)

i have info user follows in table

create table follow ( userid int, who_follow_me set<int>, who_i_follow set<int>, primary key ((userid)) )

i making query

select * posts userid in(1,2,3,4....n);

2 questions:

why still info in random order, though clustering order specified in posts. ? is model right satisfy query optimally (user can have n number of followers)?

i using cassandra 2.0.10.

"why still info in random order, though clustering order specified in posts?"

this because order by works rows within particular partitioning key. in case, if wanted see of posts specific user this:

select * posts userid=1;

that homecoming results ordered time, of rows within userid=1 partitioning key clustered it.

"is model right satisfy query optimally (user can have n number of followers)?"

it work, long don't care getting results ordered timestamp. able query posts users ordered time, need come different partitioning key. without knowing much application, utilize column group (for instance) , partition on that.

so let's evenly assign of users 8 groups: a, b, c, d, e, f, g , h. let's table design changed this:

create table posts ( grouping text, userid int, time timestamp, id uuid, content text, primary key (group, time, userid) )with clustering order (time desc)

you query posts users grouping b this:

select * posts group='b';

that give of posts of users in grouping b, ordered time. basically, query order posts appropriately time, need partition post info on other userid.

edit:

primary key (userid, follows)) clustering order (created desc);

that's not going work. in fact, should produce next error:

code=2200 [invalid query] message="missing clustering order column follows"

and if did add together follows clustering order clause, see this:

code=2200 [invalid query] message="only clustering key columns can defined in clustering order directive"

the clustering order clause can used on clustering column(s), in case, follows column. alter primary key definition cluster on follows (asc) , created (desc). have tested this, , inserted sample data, , can see query works:

aploetz@cqlsh:stackoverflow> select * posts userid=2 , follows=1; userid | follows | created | id --------+---------+--------------------------+-------------------------------------- 2 | 1 | 2015-01-25 13:27:00-0600 | 559cda12-8fe7-45d3-9a61-7ddd2119fcda 2 | 1 | 2015-01-25 13:26:00-0600 | 64b390ba-a323-4c71-baa8-e247a8bc9cdf 2 | 1 | 2015-01-25 13:24:00-0600 | 1b325b66-8ae5-4a2e-a33d-ee9b5ad464b4 (3 rows)

although, if want query userid can see posts of followers. in case, posts ordered within each followerid, this:

aploetz@cqlsh:stackoverflow> select * posts userid=2; userid | follows | created | id --------+---------+--------------------------+-------------------------------------- 2 | 0 | 2015-01-25 13:28:00-0600 | 94da27d0-e91f-4c1f-88f2-5a4bbc4a0096 2 | 0 | 2015-01-25 13:23:00-0600 | 798053d3-f1c4-4c1d-a79d-d0faff10a5fb 2 | 1 | 2015-01-25 13:27:00-0600 | 559cda12-8fe7-45d3-9a61-7ddd2119fcda 2 | 1 | 2015-01-25 13:26:00-0600 | 64b390ba-a323-4c71-baa8-e247a8bc9cdf 2 | 1 | 2015-01-25 13:24:00-0600 | 1b325b66-8ae5-4a2e-a33d-ee9b5ad464b4 (5 rows)

cassandra cql

No comments:

Post a Comment