SHC: Efficiently Reading and Writing HBase with Spark SQL (Repost)

2019-07-11

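Apache Spark and Apache HBase are two widely used big data components. Many scenarios call for using Spark to analyze or query data stored in HBase. Spark has built-in support for many data sources, HBase among them, but the built-in reader still uses TableInputFormat to...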

Parsing SDK-Reported Logs with Spark and FastJson

2019-04-17

The log format is {"CommonInfo":{"env":"iOS","productId":"JiaTuiAPP","userInfo":{"userId":"561239948443779072"},"systemInfo":{"system":"12.2","platform":"iOS","model":"iPhone 5s","pixelRatio":2,"brand":"iPhone","screenWidt...

Quickly Importing Massive Data into HBase with BulkLoad

2018-12-21

The source file is a CSV file exported from MySQL: 503003676755886086,503003161271734273,1 503003669797548035,503003161271734273,1 503003568609964035,503003161271734273,1 503003700512428038,5030031612717...

OOZIE Job Scheduling: Usage and Detailed Guide

2018-12-14

job.properties nameNode=hdfs://dev-bg-m01:8020 jobTracker=dev-bg-m01:8050 queueName=default oozie.use.system.libpath=true #oozie.libpath=/user/dmp_operator1/share/libs jdbcURL=jdbc...

Periodically Refreshing Kerberos Authentication Tickets in a Spring MVC Application

2018-12-01

package com.XXX.counter.listener; import javax.servlet.ServletContextEvent; import javax.servlet.ServletContextListener; import java.util.Timer; public class TicketScanerListener...

Uploading Files Remotely to HDFS from Java with Kerberos Authentication

2018-10-19

package com.jiatui.bigdata; import org.apache.hadoop.conf.Configuration; import org.apache.hadoop.fs.*; import org.apache.hadoop.security.UserGroupInformation; /** * Read file...

Spark 1.X DataFrame Operation Examples

2018-10-12

{"id":1, "name":"Ganymede", "age":32} {"id":2, "name":"Lilei", "age":19} {"id":3, "name":"Lily", "age":25} {"id":4, "name":"Hanmeimei", "age":25} {"id":5, "name":"Lucy", "age...