Cassandra Operations

Cassandra in Practice

1. CQL Shell client

[root@gbase bin]# ./cqlsh localhost 9042
Connected to Test Cluster at localhost:9042.
[cqlsh 5.0.1 | Cassandra 3.11.13 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.
cqlsh> 

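By default, cqlsh detects the local host and port on its own. The session above assumes the Cassandra service is already running and bound to its ports (see: [port list]); with no arguments, cqlsh connects to port 9042 on the local machine.

As the output above shows, cqlsh connected to a cluster named Test Cluster. This is the default name; it can be changed through the cluster_name parameter in conf/cassandra.yaml (see: [full yaml contents]).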

2. Basic cqlsh commands

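2.1 DESCRIBE

-- Show information about the cluster
DESCRIBE CLUSTER;

-- List every keyspace in the current Cassandra instance
DESCRIBE KEYSPACES;

-- List all tables in the keyspace
DESCRIBE TABLES;

-- Select a keyspace
USE system_traces;

-- Describe the sessions table in system_traces
cqlsh:system_traces> DESCRIBE sessions;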

2.2 CAPTURE

-- Capture the output of the commands that follow and append it to a file.
CAPTURE '/root/cassandra/outputfile'

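2.3 SHOW

-- Show the IP address and port of the Cassandra service that cqlsh is connected to
SHOW HOST

-- Show the current version
SHOW VERSION

-- Show session information; requires a uuid argument
SHOW SESSION <uuid>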

3. Data definition commands

3.1 Working with keyspaces

-- 1. Syntax
CREATE KEYSPACE <identifier> WITH <properties>;

-- 2. More specific syntax
CREATE KEYSPACE <KeyspaceName> WITH replication = {'class': '<strategy name>',
'replication_factor': <number of replicas>};

-- Values to fill in:
-- KeyspaceName: the name of the keyspace
-- strategy name: the replica placement strategy, either SimpleStrategy or NetworkTopologyStrategy
-- number of replicas: the replication factor, i.e. how many copies of the data are placed on different nodes

-- For example
CREATE KEYSPACE school WITH replication = {'class':'SimpleStrategy', 'replication_factor' : 3};

3.2 Connecting to a keyspace

USE <identifier>;

3.3 Altering a keyspace

ALTER KEYSPACE <identifier> WITH <properties>

-- A complete statement: alter the school keyspace, changing the replication factor from 3 to 1
ALTER KEYSPACE school WITH replication = {'class':'SimpleStrategy', 'replication_factor' : 1};

3.4 Dropping a keyspace

DROP KEYSPACE <identifier>

-- For example
DROP KEYSPACE school;

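4. Working with tables and indexes

4.1 Creating a table

CREATE (TABLE | COLUMNFAMILY) <tablename> ('<column-definition>' , '<column-definition>')
(WITH <option> AND <option>)

-- A complete CREATE TABLE statement for a student table with the following columns:
-- student number (id), name (name), age (age), gender (gender), home address (address), interests (interest), phone numbers (phone), education history (education)
-- id is the primary key, and each column is given a suitable data type.
-- Note: interest is a set, phone is a list, and education is a map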

CREATE TABLE student(
   id int PRIMARY KEY,  
   name text,  
   age int,  
   gender tinyint,  
   address text ,
   interest set<text>,
   phone list<text>,
   education map<text, text>
);

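4.2 Cassandra indexes (5 kinds of key)

An index is a poor fit in the following cases:
The column has very many distinct values, because the query then sifts through a large number of records to return a very small result.
The table has a counter column.
The column is updated or deleted frequently.
The query looks for a single record inside a very large partition (that is, a query that does not specify the partition key).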

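4.2.1 Primary Key

The Primary Key is used to fetch a specific row. It can be a single column (Single column Primary Key) or several columns (Composite Primary Key).
With a Single column Primary Key, that column decides which node the record is stored on.
-- Single-column primary key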
create table testTab (
id int PRIMARY KEY,
name text
);

4.2.2 Composite Primary Key

When the Primary Key is made up of several columns, it is called a Compound Primary Key or Composite Primary Key.
-- Composite primary key
-- Partition Key (key_one), Clustering Key (key_two)
create table testTab (
key_one int,
key_two int,
name text,
PRIMARY KEY(key_one, key_two)
);

4.2.3 Partition Key

Cassandra computes a hash of the Partition Key and uses it to decide which node stores the record.
If the Partition Key consists of several columns, it is called a Composite Partition Key.

create table testTab (
key_part_one int,
key_part_two int,
key_clust_one int,
key_clust_two int,
key_clust_three uuid,
name text,
PRIMARY KEY((key_part_one,key_part_two), key_clust_one, key_clust_two, key_clust_three)
);

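4.2.4 Clustering Key

The Clustering Key determines how rows that share the same Partition Key are sorted within a partition. The default is ascending; the sort order can be set explicitly in the CREATE TABLE statement.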
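
As a rough illustration of that ordering, the Python sketch below groups rows by partition key and sorts each group by clustering key. The sample rows and the `descending` flag are made up for the example; in CQL itself the direction is declared on the table with a `WITH CLUSTERING ORDER BY (key_two DESC)` clause.

```python
from collections import defaultdict

# (key_one, key_two, name): key_one is the partition key, key_two the clustering key
rows = [(1, 30, "c"), (2, 5, "x"), (1, 10, "a"), (1, 20, "b")]


def partitions(rows, descending=False):
    """Group rows by partition key and sort each group by clustering key."""
    grouped = defaultdict(list)
    for key_one, key_two, name in rows:
        grouped[key_one].append((key_two, name))
    for group in grouped.values():
        group.sort(key=lambda row: row[0], reverse=descending)
    return dict(grouped)


print(partitions(rows)[1])                   # [(10, 'a'), (20, 'b'), (30, 'c')]
print(partitions(rows, descending=True)[1])  # [(30, 'c'), (20, 'b'), (10, 'a')]
```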

4.3 Altering a table

-- Add a column, syntax
ALTER TABLE <tablename> ADD <new column> <datatype>;
-- Drop a column, syntax
ALTER TABLE <tablename> DROP <columnname>;

4.4 Dropping a table

DROP TABLE <tablename>

4.5 Truncating a table

TRUNCATE <tablename>

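5. Creating indexes

5.1 Indexing a regular column

CREATE INDEX <identifier> ON <tablename> (<columnname>)

-- Add an index named sname on the name column of student:
CREATE INDEX sname ON student (name);

-- Add an index on the age column of student without naming it:
CREATE INDEX ON student (age);  -- a default index name is generated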

5.2 Indexing a collection column

CREATE INDEX ON student(interest);                       -- index on a set collection
CREATE INDEX mymap ON student(KEYS(education));          -- index on the keys of a map collection

5.3 Dropping an index

DROP INDEX <identifier>
-- For example: DROP INDEX sname;

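6. Querying data

6.1 Selecting all rows

cqlsh:school> select * from student;

6.2 Using indexes in queries

Cassandra places certain restrictions on how keys and indexes can be used in a query:
-- 1. The first primary key column (the partition key) can only be queried with =
-- 2. The second primary key column (the clustering key) supports = > < >= <=
-- 3. An indexed column supports only =
-- 4. Columns that are neither indexed nor part of the primary key can be filtered with ALLOW FILTERING

Example

-- key_one is the first primary key column, key_two the second, age is an indexed column, and name is a regular column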
create table testTab (
key_one int,
key_two int,
name text,
age  int,
PRIMARY KEY(key_one, key_two)
);
create INDEX tage ON testTab (age);

-- 1. key_one is the first primary key column; an = query on key_one returns results
select * from testtab where key_one=4;
-- A range query on key_one with > does not return results

-- 2. Do not query on key_two alone (not recommended)
select * from testtab where key_two = 8 ALLOW FILTERING;

-- 3. An indexed column supports only =
select * from testtab where age = 19;   -- correct
select * from testtab where age > 20 ;  -- raises an error
select * from testtab where age >20 allow filtering;  -- returns results, but not recommended

-- 4. Regular columns (neither indexed nor part of the primary key)
select * from testtab where key_one=12 and name='张小仙'; -- raises an error
select * from testtab where key_one=12 and name='张小仙' allow filtering;  -- works

-- 5. Collection columns
select * from student where interest CONTAINS '电影';        -- query a set collection
select * from student where education CONTAINS key  '小学';  -- query the keys of a map collection
select * from student where education CONTAINS '中心第9小学' allow filtering; -- query the values of a map

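-- 6. ALLOW FILTERING
ALLOW FILTERING is a very resource-intensive way to query. If a table holds, say, 1 million rows and 95% of them satisfy the query condition, the query is still relatively efficient, and using ALLOW FILTERING is reasonable.

If the table holds 1 million rows and only 2 of them satisfy the condition, the query is extremely inefficient: Cassandra loads 999,998 rows for nothing. If the query runs often, it is better to add an index on the column.
-- ALLOW FILTERING causes no trouble while the table is small, but queries become slow once the data volume grows.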
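
The 1-million-row case can be made concrete with a short Python model. It is a sketch, not Cassandra code: the list `ages` stands in for one column of a hypothetical table, and the function simply counts how many rows a filtering scan has to read compared with how many it returns.

```python
# One age value per row: 2 matching rows among 1,000,000 (illustrative data)
ages = [19, 19] + [20] * 999_998


def filtered_scan(values, wanted):
    """Read every row and keep only the matches, as a filtering scan must."""
    rows_read = 0
    matches = 0
    for value in values:
        rows_read += 1
        if value == wanted:
            matches += 1
    return rows_read, matches


rows_read, matches = filtered_scan(ages, 19)
print(rows_read, matches, rows_read - matches)  # 1000000 2 999998
```

The last number is the work thrown away: every row except the two matches was read only to be discarded.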

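6.3 Sorting query results

Cassandra also supports sorting with ORDER BY, subject to these conditions:
1) The query must have an = condition on the first primary key column. That column decides which machine a record lives on, and Cassandra only sorts records held on a single machine.
2) Sorting is only possible on the second, third, fourth ... primary key columns, taken in order and all in the same direction.
3) The query must not use an index.
-- The result of any Cassandra query is ordered, because that is how the data is stored internally
select * from testtab where key_one = 12 order by key_two;  -- correct
select * from testtab where key_one = 12 and age =19 order by key_two;  -- wrong: index queries are not allowed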

6.4 Paging

Use the LIMIT keyword to cap the number of rows a query returns when paging.
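
LIMIT on its own only caps the row count. One common way to reach the next page, assumed here and not stated in the original, is to restart after the last clustering key already seen, roughly `where key_one = 12 and key_two > <last key_two> limit 3` in CQL. The Python sketch below models that idea on an in-memory partition; the data and the `page` helper are hypothetical.

```python
# One partition, already sorted by clustering key: (key_two, name)
partition = [(k, f"name-{k}") for k in range(1, 8)]


def page(rows, after=None, limit=3):
    """Return up to `limit` rows whose clustering key is greater than `after`."""
    selected = [row for row in rows if after is None or row[0] > after]
    return selected[:limit]


first = page(partition)
second = page(partition, after=first[-1][0])
print(first)   # [(1, 'name-1'), (2, 'name-2'), (3, 'name-3')]
print(second)  # [(4, 'name-4'), (5, 'name-5'), (6, 'name-6')]
```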

7. Inserting data

INSERT INTO <tablename>(<column1 name>, <column2 name>....) VALUES (<value1>, <value2>....) USING <option>

INSERT INTO student (id,address,age,gender,name,interest, phone,education) VALUES (1011,'中山路21号',16,1,'Tom',{'游泳', '跑步'},['010-88888888','13888888888'],{'小学' : '城市第一小学', '中学' : '城市第一中学'}) ;

INSERT INTO student (id,address,age,gender,name,interest, phone,education) VALUES (1012,'朝阳路19号',17,2,'Jerry',{'看书', '电影'},['020-66666666','13666666666'],{'小学' :'城市第五小学','中学':'城市第五中学'});

8. TTL

Add a TTL: once the number of seconds set as computed_ttl has passed, the data is deleted automatically.
INSERT INTO student (id,address,age,gender,name,interest, phone,education) VALUES (1030,'朝阳路30号',20,1,'Cary',{'运动', '游戏'},['020-7777888','139876667556'],{'小学' :'第30小学','中学':'第30中学'}) USING TTL 60;
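
The expiry rule can be pictured with a small Python model. This is a sketch rather than Cassandra code: the `TtlCell` class and the explicit `now` argument are made up for illustration, and time is passed in by hand so the example stays deterministic.

```python
class TtlCell:
    """A value that stops being readable once its time-to-live has elapsed."""

    def __init__(self, value, ttl_seconds, written_at):
        self.value = value
        self.expires_at = written_at + ttl_seconds

    def read(self, now):
        """Return the value while it is still live, otherwise None."""
        return self.value if now < self.expires_at else None


cell = TtlCell("Cary", ttl_seconds=60, written_at=0)
print(cell.read(now=59))  # Cary
print(cell.read(now=60))  # None
```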

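9. Updating column data

Keywords used when updating data in a table:
-- Where - selects the row to update
-- Set - sets the value to update
-- Must - include all the columns that make up the primary key
If the given row does not exist, UPDATE creates a new row.
    UPDATE <tablename>
    SET <column name> = <new value>,
    <column name> = <value>....
    WHERE <condition>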

9.1 Updating simple data

UPDATE student set gender = 1 where id = 1012;

9.2 Updating set data

Use the UPDATE command with the '+' operator
UPDATE student SET interest = interest + {'游戏'} WHERE id = 1012;

9.3 Removing one element

Use the UPDATE command with the '-' operator
UPDATE student SET interest = interest - {'电影'} WHERE id = 1012;

9.4 Removing all elements

-- Either UPDATE or DELETE can be used; the effect is the same
UPDATE student SET interest = {} WHERE id = 1012;
-- or
DELETE interest FROM student WHERE id = 1012;
-- In general, a set, list, or map needs at least one element; otherwise Cassandra cannot tell it apart from a null value

9.5 Updating list data

-- 1) Use UPDATE to assign values to the list
UPDATE student SET phone = ['020-66666666', '13666666666'] WHERE id = 1012;
-- 2) Insert a value at the front of the list
UPDATE student SET phone = [ '030-55555555' ] + phone WHERE id = 1012;
-- 3) Insert a value at the end of the list
UPDATE student SET phone = phone + [ '040-33333333' ]  WHERE id = 1012;
-- 4) Set a value by list index, overwriting the existing value
UPDATE student SET phone[2] = '050-22222222' WHERE id = 1012;
-- 5) [Not recommended] Use DELETE with an index to remove the value at a specific position
-- This is not thread-safe: if another thread adds an element at the front during the operation, the wrong element is removed
DELETE phone[2] FROM student WHERE id = 1012;
-- 6) [Recommended] Use UPDATE with '-' to remove every occurrence of a given value from the list
UPDATE student SET phone = phone - ['020-66666666'] WHERE id = 1012;
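
To follow what the list ends up holding, steps 1 to 4 and step 6 can be replayed on a plain Python list. This only mirrors the intended semantics under the assumption that the statements run one after another in that order; it is not how Cassandra stores lists.

```python
phone = ["020-66666666", "13666666666"]   # 1) assign the whole list
phone = ["030-55555555"] + phone          # 2) prepend
phone = phone + ["040-33333333"]          # 3) append
phone[2] = "050-22222222"                 # 4) overwrite the element at index 2

# 6) remove every occurrence of the given values
to_remove = {"020-66666666"}
phone = [number for number in phone if number not in to_remove]

print(phone)  # ['030-55555555', '050-22222222', '040-33333333']
```

Removing by value, as in step 6, does not depend on positions, which is why it avoids the index race described for step 5.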

9.6 Updating map data

-- The output order of a map depends on the map type.
-- 1) Use the INSERT or UPDATE command
UPDATE student SET education=
  {'中学': '城市第五中学', '小学': '城市第五小学'} WHERE id = 1012;
-- 2) Use UPDATE to set the value of a specific element
UPDATE student SET education['中学'] = '爱民中学' WHERE id = 1012;
-- 3) The following syntax adds map elements. If a key already exists its value is overwritten; otherwise the entry is inserted
UPDATE student SET education = education + { '幼儿园' : '大海幼儿园', '中学': '科技路中学'} WHERE id = 1012;

9.7 Removing elements

-- Use DELETE to remove data
DELETE education['幼儿园'] FROM student WHERE id = 1012;
-- Use UPDATE to remove data
UPDATE student SET education=education - {'中学','小学'} WHERE id = 1012;
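
The set and map operators behave much like Python's set and dict operations, which gives a quick way to check what a sequence of updates leaves behind. The sketch below is an analogy only: it replays the set updates from 9.2 and 9.3 and the map updates from 9.6, then the UPDATE-based removal from 9.7, on in-memory values.

```python
# Set: '+' adds elements, '-' removes them
interest = {"看书", "电影"}
interest = interest | {"游戏"}
interest = interest - {"电影"}

# Map: assign, set one key, merge (existing keys are overwritten), remove keys
education = {"中学": "城市第五中学", "小学": "城市第五小学"}
education["中学"] = "爱民中学"
education = {**education, "幼儿园": "大海幼儿园", "中学": "科技路中学"}
merged = dict(education)  # snapshot after the merge
education = {k: v for k, v in education.items() if k not in {"中学", "小学"}}

print(interest)
print(merged)
print(education)
```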

9.8 Deleting a row

DELETE FROM <identifier> WHERE <condition>;

DELETE FROM student WHERE id=1012;

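9.9 Batch operations

-- A batch combines several update operations into a single request, reducing network round trips between client and server. Operations in a batch that share the same partition key are isolated
-- With BATCH you can run several modification statements (insert, update, delete) together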
BEGIN BATCH
<insert-stmt>/ <update-stmt>/ <delete-stmt>
APPLY BATCH

The batch below performs 3 operations:
-- insert a new row with id = 1015
-- update the row with id = 1012, setting age to 11
-- delete the existing row with id = 1011:

BEGIN BATCH
    INSERT INTO student (id,address,age,gender,name) VALUES (1015,'上海路',20,1,'Jack') ;
    UPDATE student set age = 11 where id= 1012;
    DELETE FROM student WHERE id=1011;
APPLY BATCH;

Author: 123456789987654321
