Compilation
Environment
- Hadoop 3.0.0-cdh6.3.1
- Hive 2.1.1-cdh6.3.1
Recommendation
Build and install directly with mvn clean install -DskipTests, which makes it convenient to modify the code and repackage individual modules later.
Problems you may run into
FileNotFoundException: File file:/root/.flink/application_xxxxxxxxxxx/
Cause: the CDH distribution of Hadoop keeps its configuration files in a different location than open-source Hadoop, while ChunJun's launch script submit.sh hard-codes the configuration directory as $HADOOP_HOME/etc/hadoop:
```shell
# if HADOOP_HOME is not set or not a directory, ignore hadoopConfDir parameter
if [ ! -z $HADOOP_HOME ] && [ -d $HADOOP_HOME ]; then
    echo "HADOOP_HOME is $HADOOP_HOME"
    PARAMS="$PARAMS -hadoopConfDir $HADOOP_HOME/etc/hadoop"
else
    echo "HADOOP_HOME is empty!"
fi
```
Fix: add -hadoopConfDir $HADOOP_CONF_DIR to the launch command, or modify ChunJun's launch script directly.
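If you choose to patch the script, one possible shape is to prefer $HADOOP_CONF_DIR (which CDH sets) and fall back to the open-source layout. This is an illustrative sketch, not the upstream code; the function name is made up:

```shell
# Sketch: resolve the Hadoop config dir, preferring HADOOP_CONF_DIR (CDH)
# and falling back to the open-source layout under $HADOOP_HOME/etc/hadoop.
resolve_hadoop_conf() {
    if [ -n "$HADOOP_CONF_DIR" ] && [ -d "$HADOOP_CONF_DIR" ]; then
        echo "$HADOOP_CONF_DIR"
    elif [ -n "$HADOOP_HOME" ] && [ -d "$HADOOP_HOME/etc/hadoop" ]; then
        echo "$HADOOP_HOME/etc/hadoop"
    else
        return 1    # neither location exists
    fi
}

# Example: append the resolved dir to the submit parameters, if found
if conf_dir=$(resolve_hadoop_conf); then
    PARAMS="$PARAMS -hadoopConfDir $conf_dir"
fi
```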
No FileSystem for scheme "hdfs"
Add a flink-shaded-hadoop-x jar (pick the one matching your Hadoop major version) to the $FLINK_HOME/lib/ directory. Download locations:
- https://repo1.maven.org/maven2/org/apache/flink/flink-shaded-hadoop-2/2.7.5-10.0/flink-shaded-hadoop-2-2.7.5-10.0.jar
- org.apache.flink » flink-shaded-hadoop-3-uber (search on mvnrepository.com)
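Before resubmitting the job, it can save a round trip to confirm the jar actually landed in Flink's lib directory. A small sketch (the function name and the /opt/flink default are assumptions):

```shell
# Sketch: check whether a flink-shaded-hadoop jar is present in a lib dir.
has_shaded_hadoop() {
    ls "$1" 2>/dev/null | grep -q 'flink-shaded-hadoop'
}

if has_shaded_hadoop "${FLINK_HOME:-/opt/flink}/lib"; then
    echo "shaded hadoop jar found"
else
    echo "no shaded hadoop jar in lib; hdfs:// URLs will still fail"
fi
```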
Required field 'client_protocol' is unset! Struct:TOpenSessionReq(client_protocol:null, configuration:{use:database=default})
Cause: Hive dependency version conflict — the Hive JDBC client version does not match the HiveServer2 it connects to.
Fix:
Add the CDH repository to the top-level pom file:
```xml
<!-- add Cloudera's Maven repository -->
<repositories>
    <repository>
        <id>cloudera-repo-releases</id>
        <url>https://repository.cloudera.com/artifactory/public/</url>
        <releases>
            <enabled>true</enabled>
        </releases>
        <snapshots>
            <enabled>false</enabled>
        </snapshots>
    </repository>
</repositories>
```
Change the Hive version in the pom file of the chunjun-connector-hive module:
```xml
<!-- switch the Hive artifacts to the CDH version -->
<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-jdbc</artifactId>
    <version>2.1.1-cdh6.3.1</version>
</dependency>
<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-serde</artifactId>
    <version>2.1.1-cdh6.3.1</version>
</dependency>
<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-exec</artifactId>
    <version>2.1.1-cdh6.3.1</version>
</dependency>
```
Change the Hive version in the pom file of the chunjun-connector-hdfs module:
```xml
<hive.version>2.1.1-cdh6.3.1</hive.version>
```
cannot be cast to com.google.protobuf.Message
Cause: Hadoop dependency version conflict.
Fix: exclude the hadoop-common and hadoop-client dependencies.
First, comment out the following blocks in the top-level pom file (otherwise the modified code will trip the license/format checks and packaging will fail):
```xml
<!--<licenses>-->
<!--    <license>-->
<!--        <name>The Apache Software License, Version 2.0</name>-->
<!--        <url>https://www.apache.org/licenses/LICENSE-2.0.txt</url>-->
<!--        <distribution>repo</distribution>-->
<!--    </license>-->
<!--</licenses>-->

<!--<plugin>-->
<!--    <groupId>com.diffplug.spotless</groupId>-->
<!--    <artifactId>spotless-maven-plugin</artifactId>-->
<!--</plugin>-->

<!--<plugin>-->
<!--    <groupId>com.diffplug.spotless</groupId>-->
<!--    <artifactId>spotless-maven-plugin</artifactId>-->
<!--    <version>2.4.2</version>-->
<!--    <configuration>-->
<!--        <java>-->
<!--            <googleJavaFormat>-->
<!--                <version>1.7</version>-->
<!--                <style>AOSP</style>-->
<!--            </googleJavaFormat>-->
<!--            <!– \# refers to the static imports –>-->
<!--            <importOrder>-->
<!--                <order>com.dtstack,org.apache.flink,org.apache.flink.shaded,,javax,java,scala,\#</order>-->
<!--            </importOrder>-->
<!--            <removeUnusedImports/>-->
<!--        </java>-->
<!--    </configuration>-->
<!--    <executions>-->
<!--        <execution>-->
<!--            <id>spotless-check</id>-->
<!--            <phase>validate</phase>-->
<!--            <goals>-->
<!--                <goal>check</goal>-->
<!--            </goals>-->
<!--        </execution>-->
<!--    </executions>-->
<!--</plugin>-->
```
Exclude the Hadoop dependencies in the pom file of the chunjun-connector-hive module:
```xml
<dependency>
    <groupId>com.dtstack.chunjun</groupId>
    <artifactId>chunjun-connector-hdfs</artifactId>
    <version>${project.version}</version>
    <exclusions>
        <exclusion>
            <artifactId>hive-serde</artifactId>
            <groupId>org.apache.hive</groupId>
        </exclusion>
        <exclusion>
            <artifactId>hive-exec</artifactId>
            <groupId>org.apache.hive</groupId>
        </exclusion>
        <exclusion>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-hdfs-client</artifactId>
        </exclusion>
        <exclusion>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
        </exclusion>
    </exclusions>
</dependency>
<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-jdbc</artifactId>
    <version>2.1.1-cdh6.3.1</version>
    <exclusions>
        <exclusion>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-hdfs</artifactId>
        </exclusion>
        <exclusion>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
        </exclusion>
        <exclusion>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-hdfs-client</artifactId>
        </exclusion>
        <exclusion>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
        </exclusion>
        <exclusion>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-yarn-registry</artifactId>
        </exclusion>
    </exclusions>
</dependency>
<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-serde</artifactId>
    <version>2.1.1-cdh6.3.1</version>
    <exclusions>
        <exclusion>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-hdfs</artifactId>
        </exclusion>
        <exclusion>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
        </exclusion>
        <exclusion>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-hdfs-client</artifactId>
        </exclusion>
    </exclusions>
</dependency>
<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-exec</artifactId>
    <version>2.1.1-cdh6.3.1</version>
    <exclusions>
        <exclusion>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-hdfs</artifactId>
        </exclusion>
        <exclusion>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
        </exclusion>
        <exclusion>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-hdfs-client</artifactId>
        </exclusion>
        <exclusion>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-yarn-server-resourcemanager</artifactId>
        </exclusion>
    </exclusions>
</dependency>
```
Exclude the Hadoop dependencies in the pom file of the chunjun-connector-hdfs module (they are pulled in transitively via hive-common):
```xml
<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-exec</artifactId>
    <version>${hive.version}</version>
    <exclusions>
        <exclusion>
            <artifactId>hive-common</artifactId>
            <groupId>org.apache.hive</groupId>
        </exclusion>
    </exclusions>
</dependency>
<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-serde</artifactId>
    <version>${hive.version}</version>
    <exclusions>
        <exclusion>
            <artifactId>hive-common</artifactId>
            <groupId>org.apache.hive</groupId>
        </exclusion>
    </exclusions>
</dependency>
```
Invalid signature file digest for Manifest main attributes
Cause: signature files under META-INF were not fully excluded from the shaded jar, so the merged manifest fails signature verification.
Fix: change the shade plugin configuration in the pom file of the chunjun-connector-hdfs module to the following:
```xml
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-shade-plugin</artifactId>
    <executions>
        <execution>
            <phase>package</phase>
            <goals>
                <goal>shade</goal>
            </goals>
            <configuration>
                <artifactSet>
                    <excludes>
                        <exclude>org.slf4j:slf4j-api</exclude>
                        <exclude>log4j:log4j</exclude>
                        <exclude>ch.qos.logback:*</exclude>
                    </excludes>
                </artifactSet>
                <filters>
                    <filter>
                        <artifact>*:*</artifact>
                        <excludes>
                            <exclude>META-INF/*.SF</exclude>
                            <exclude>META-INF/*.DSA</exclude>
                            <exclude>META-INF/*.RSA</exclude>
                        </excludes>
                    </filter>
                </filters>
            </configuration>
        </execution>
    </executions>
</plugin>
```