关系型数据库SQL--MySQL和Orcale
基本查看--查看数据库。表名和表结构-表数据和总数
show status
show databases;
use your_database;
show tables;
DESC your_table;
查询单列或者多列
select user_birthday from user
查看行以及行数
select * from your_table limit 3;
计数:select count(*) from your_table
排序
1.只排一列
select user_birthday from user order by user_birthday;
select user_birthday from user order by user_birthday DESC;
2.多列排序,升序ASC是默认的
select * from user order by user_id,user_birthday;
select * from user order by user_id DESC,user_birthday;
select * from user order by user_id ,user_birthday DESC;
筛选以及过滤
过滤行以及查找 where in not like
select user_birthday from user where id>3;
select user_birthday from user where id like '3%';
去重
select distinct user_birthday from user
字符串和数值操作
拼接:select name||id as tiele from tableA ;
求和:select sum(field1) as sumvalue from tableA
平均:select avg(field1) as avgvalue from tableA
最大:select max(field1) as maxvalue from tableA
最小:select min(field1) as minvalue from tableA
分组和汇总
select count(*) from tableA group by sex;
select id,count(*) from tableA group by sex;
过滤分组
select id,count(*) from tableA group by sex having count(*)>2;
嵌套查询
select name,sex,id from tableA where id in (select id from tableB where id >3)
联结查询--以列为单位对表进行联结
select name,sex,id from tableA ,tableB where tableA.id = tableB.id;
select name,sex,id from tableA inner join tableB on tableA.id = tableB.id;
select name,sex,id from tableA ,tableB,tableC where tableA.id = tableB.id and tableC.hot = tableB.hot;
组合查询--以行位单位对表进行操作
select name,sex,id from tableA union select name,sex,id from tableB where tableB.id>3;
select name,sex,id from tableA union all select name,sex,id from tableB where tableB.id>3;
使用索引以及存储过程
说明
join 是两张表做交连后里面条件相同的部分记录产生一个记录集,
union是产生的两个记录集(字段要一样的)并在一起,成为一个新的记录集
帮助系统
help
help index
?index
退出
exit
注意:
MySQL和Orcale有不一样的地方
除了多表关联查询,子查询以及表集合查询外,还有层次查询
Hive中基本的数据分析型语句
HQL---进入Hive中类 SQL 查询语言-关系型分析查询作业
进入
su hdfs
hbase shell
查看已存在的表
show databases;
show tables;
SHOW TABLES 'page.*';
查看表中有多少分区:
show partitions logs;
查看invites的表的结构
DESCRIBE invites;
查看部分行--查询的结果是随机选择的
SELECT foo FROM invites limit 3;
查看分区部分行
SELECT a.foo FROM invites a limit 3;
查看行以及行数
SELECT * FROM invites limit 3;
计数 select count(*) from invites;
排序
在分布式中的排序有所不同---全局排序和reduced有关
order by
sort by
distribute by
cluster By
筛选以及过滤
SELECT a.foo FROM invites a WHERE a.ds='2008-08-15';
去重查询
Hive上一个典型表内除重的写法
select ad ,sum(plus),count(distinct name,id) from invites;
分组和汇总
SELECT a.bar, count(*) WHERE a.foo > 0 GROUP BY a.bar;
SELECT year(ymd), avg(price_close) FROM stocks WHERE exchange = 'NASDAQ' AND symbol = 'AAPL' GROUP BY year(ymd) HAVING avg(price_close) > 50.0;
联结
SELECT sales.*, things.* FROM sales JOIN things ON (sales.id = things.id);
SELECT sales.*, things.* FROM sales LEFT OUTER JOIN things ON (sales.id = things.id);
帮助系统
显示所有函数:
show functions;
查看函数用法:
describe function substr;
退出
exit;
Spark SQL
进入
spark-sql
支持Hive的查询命令
SELECT
GROUP BY
ORDER BY
CLUSTER BY
SORT BY
All Hive operators, including:
Relational operators (=, ⇔, ==, <>, <, >, >=, <=, etc)
Arithmetic operators (+, -, *, /, %, etc)
Logical operators (AND, &&, OR, ||, etc)
Complex type constructors
Mathematical functions (sign, ln, cos, etc)
String functions (instr, length, printf, etc)
User defined functions (UDF)
User defined aggregation functions (UDAF)
User defined serialization formats (SerDes)
Window functions
Joins
JOIN
{LEFT|RIGHT|FULL} OUTER JOIN
LEFT SEMI JOIN
CROSS JOIN
例子
SELECT people.name FROM people join parquetFile where people.name=parquetFile.name
Unions
Sub-queries
SELECT col FROM ( SELECT a + b AS col from t1) t2
Hbase
查看基本信息
version
status
whoami
查看有哪些表
list
describe 'table'
查数据内容:
scan 'my_test',{LIMIT => 5}
get 'my_test','rowkey001'
查看数据量--在数据量相对较小的时候用
count 'tablename', {INTERVAL => 100, CACHE => 500}
退出
exit