yugouai
  • Views: 491645
  • From: Shenzhen
Article list
Deploying Kafka, Storm, and ZooKeeper
1. Installation environment: zookeeper-3.4.8, apache-storm-0.9.3, jdk1.8.0_91, kafka_2.10-0.10.0.0
2. Configure Kafka
# The id of the broker. This must be set to a unique integer for each broker.
broker.id=0
# The port the socket server listens on
port=9092
# A comma separated list of di ...

kafka

Common commands:
val rdd1 = sc.parallelize(List(('a',1),('a',2)))
val rdd = sc.textFile("/usr/local/spark/tmp/char.data")
rdd.count
rdd.cache
val word_count = rdd.flatMap(_.split(" ")).map((_,1)).reduceByKey(_+_)
word_count.saveAsTextFile("/usr/local/spark/tmp/result")
...
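The `flatMap` / `map` / `reduceByKey` chain in the excerpt above is the classic word count. As a rough illustration of what that pipeline computes, here is a minimal sketch in plain Python, with no Spark involved. The function name `word_count` and the sample lines are assumptions made for this example, not part of the original post.

```python
from collections import Counter


def word_count(lines):
    """Count words across lines, mirroring flatMap -> map -> reduceByKey."""
    # flatMap(_.split(" ")): split every line on single spaces and flatten
    words = (word for line in lines for word in line.split(" "))
    # map((_, 1)) followed by reduceByKey(_ + _): add 1 per occurrence of a word
    return Counter(words)


if __name__ == "__main__":
    # Hypothetical input standing in for the contents of char.data
    sample = ["a b a", "b c"]
    for word, count in sorted(word_count(sample).items()):
        print(word, count)
```

In Spark the same steps run in parallel over partitions of the file; this sketch only shows the per-key summing on a single machine.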

Flume NG

MongoDB sink
import java.io.IOException;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import java.util.Map;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import com.cloudera.flume.conf.Context;
import com.cl ...

Python map sorting

Abstract This PEP suggests a "sort by value" operation for dictionaries. The primary benefit would be in terms of "batteries included" support for a common Python idiom which, in its current form, is both difficult for beginners to understand and cumbersome fo ...
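For reference, the "sort by value" idiom the abstract refers to can be written in current Python with the built-in `sorted` and a key function. The sketch below is one common way to do it, not necessarily the form the PEP proposed; the helper name `items_by_value` and the sample dictionary are assumptions made for this example.

```python
from operator import itemgetter


def items_by_value(d, reverse=False):
    """Return the (key, value) pairs of d ordered by value."""
    # itemgetter(1) selects the value from each (key, value) pair
    return sorted(d.items(), key=itemgetter(1), reverse=reverse)


if __name__ == "__main__":
    # Hypothetical word frequencies
    counts = {"a": 3, "b": 1, "c": 2}
    print(items_by_value(counts))
    print(items_by_value(counts, reverse=True))
```

Because `sorted` is stable, keys with equal values keep the order in which the dictionary yields them.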

MRUnit testing

Map/Reduce unit testing
About MRUnit
Word count test
package com.irwin.hadoop;
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
im ...

Vertica overview

Overview, as shown in the figure below

YARN overview

As shown in the figure below:
From: http://blog.csdn.net/kongxx/article/details/38975539
A note on an error that occurred while importing real-time data into Vertica.
Caused by: java.sql.SQLException: [Vertica][VJDBC](5065) ERROR: Too many ROS containers exist for the following projections: public.action_log_super (limit = 39936, R ...

Vertica optimization

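I. Using Database Designer
1. Create the query SQL. This should be the commonly run queries; Vertica uses it to generate optimized PROJECTIONs and produces some scripts, which are stored in the location set during setup
    (1) Multi-predicate queries: commonly used SQL statements
        SELECT * FROM QUERY_PROFILES
    (2) ORDER operations
    (3) Data distribution algorithm (data broadcast, data segmentation)
2. Check the query_events system table to find items that need optimization
        select * from QUERY_EVENTS
    (1) Mainly look at the event_type field
        PREDICATE OUTSI ...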
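Vertica 6.x only supports writing Vertica UDFs in R and C++. Because the built-in string functions cannot provide substring_index functionality, it is implemented here in C++:
#include <algorithm>
#include <string>
#include "Vertica.h"
using namespace Vertica;
using namespace std;
// Get the entire substring of text that precedes the |delim_num|-th occurrence of |delim|
std::string getSubString(const std ...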
Implementing the MySQL function substring_index in Vertica:
package com.yy.vertica;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import org.apache.commons.lang3.StringUtils;
import org.apache.commons.lang3.math.NumberUtils;
import com.vertica.sdk.BlockReader;
import com.vertica. ...
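To make the target behavior concrete, here is a minimal plain-Python sketch of the semantics MySQL documents for SUBSTRING_INDEX(str, delim, count): a positive count keeps everything to the left of the count-th delimiter, a negative count keeps everything to the right of the count-th delimiter counted from the end. It is not the Vertica UDF from either post. Returning an empty string for an empty delimiter is an assumption of this sketch, and edge cases with overlapping multi-character delimiters may differ from MySQL.

```python
def substring_index(text, delim, count):
    """Approximate MySQL's SUBSTRING_INDEX(text, delim, count)."""
    # Assumption: an empty delimiter or a zero count yields an empty string
    if count == 0 or not delim:
        return ""
    parts = text.split(delim)
    if count > 0:
        # Keep the pieces before the count-th delimiter from the left
        return delim.join(parts[:count])
    # Keep the pieces after the |count|-th delimiter from the right
    return delim.join(parts[count:])


if __name__ == "__main__":
    print(substring_index("www.mysql.com", ".", 2))
    print(substring_index("www.mysql.com", ".", -2))
```

When the delimiter occurs fewer times than the absolute value of count, the slices cover every piece and the whole input string is returned.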

Downloading files with JSP

Download a file from a remote server with JSP, fetching it as a data stream. The back-end code is as follows:
@RequestMapping
public void requestDownlod(HttpServletResponse response, String filePath){
    RemoteDownLoadHelper helper = new RemoteDownLoadHelper();
    String resultFileAbsolutePath = helper.getTargetFilePath(filePath);
    List<File> fileList = FileUtil.l ...
package com.irwin.redis;
import java.util.Arrays;
import java.util.List;
import org.junit.Test;
import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisPoolConfig;
import redis.clients.jedis.JedisShardInfo;
import redis.clients.jedis.Pipeline;
import redis.clients.jedis. ...
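Clinton: I want to be a good person, marry a good wife, raise a few good children, make a few good friends, be a successful politician, and write a great book.
T1: Why master your time
1. Too busy: too much pressure, feeling tense, physically unwell
2. Too idle: nothing to do, constantly falling behind and declining
掌 ...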
Download the source code
svn checkout http://code.taobao.org/svn/datax/trunk
Environment
root@datanode158:~# java -version
java version "1.7.0_45"
root@datanode158:~# python -V
Python 2.7.3
root@datanode158:~# ant -version
Apache Ant(TM) version 1.8.2 compiled on December 3 2011
root@datanode158 ...