27. Data governance of the yiee data operation system of Duoyi Education - atlas deployment and use


1, atlas Compilation and packaging

2, atlas installation configuration

        1. Compilation environment

        2. Compilation steps

        3. Installation steps

        4. Hive hook configuration

        5. Run test

3, atlas configuration hive hook

4, Introduction to atlas

        1, Basic Search

        2, Advanced Search

        3. Create Entity

        6. Lineage view

1, atlas Compilation and packaging

First, download the source package from the official Apache Atlas website

Upload it to linux and extract it
[root@h2 ~]# tar -zxf apache-atlas-2.0.0-sources.tar.gz -C /opt/app/

Enter the source directory, then compile and package with maven
mvn clean -DskipTests package -Pdist,embedded-hbase-solr

When the build finishes, the packaged result is in the newly created distro/target directory under the source directory

2, atlas installation configuration

1. Compilation environment

An environment with maven 3.6.3 or later

Configure a domestic (China) mirror source for maven by editing the settings file: vi $M2_HOME/conf/settings.xml

    <name>aliyun maven</name>

2. Compilation steps

Upload the source package to linux
Modify the dependency versions and the download addresses of the related packages in two places:
the atlas parent project pom file

the distro module pom file


Compile and package with maven
Note that atlas can use either the embedded HBase and Solr or an external HBase and Solr as its underlying storage and index/search components

To use the embedded HBase and Solr, compile and package with the following commands

cd /opt/atlas2.0 
export MAVEN_OPTS="-Xms2g -Xmx2g"
mvn clean -DskipTests package -Pdist,embedded-hbase-solr

The build time depends on network speed, so be patient and rerun the command if downloads fail. A fast, stable connection (for example through a VPN) helps

3. Installation procedure

Move the compiled atlas package out of the build directory

mv distro/target/apache-atlas-2.0.0/ /opt/app/

Start atlas

cd /opt/app/
cd apache-atlas-2.0.0/
bin/atlas_start.py

Then access port 21000: it returns a 503 error
Kill the atlas process and start the solr service manually

cd apache-atlas-2.0.0/solr/
bin/solr start -c -z localhost:2181 -p 8984 -force

Create an initial index (collection) in solr

bin/solr create -c vertex_index -shards 1 -replicationFactor 1 -force

Then open solr's web UI in a browser (port 8984 in the command above); if the page loads, solr has started successfully

Restart atlas

[root@h1 apache-atlas-2.0.0]# bin/atlas_start.py 

The Server is no longer running with pid 102331
configured for local hbase.
hbase started.
configured for local solr.
solr started.
setting up solr collections...
starting atlas on host localhost
starting atlas on port 21000
Apache Atlas Server started!!!

Access port 21000 on h1 again
It may still return 503
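Because atlas can take a while to come up, it helps to check the HTTP status from the command line before digging further. The following is only a sketch: the function names are made up for this post, and the admin:admin login is an assumption (the default file-based account), so adjust both to your setup.

```shell
#!/bin/sh
# Assumption: atlas still uses its default file-based login (admin/admin).
ATLAS_AUTH="${ATLAS_AUTH:-admin:admin}"

# Print the HTTP status code for a URL; curl prints 000 if it cannot connect.
atlas_status() {
    curl -s -o /dev/null --max-time 5 -u "$ATLAS_AUTH" -w '%{http_code}' "$1"
}

# Poll the URL until it answers 200, or give up after ATTEMPTS tries.
# Usage: wait_for_atlas URL [ATTEMPTS] [PAUSE_SECONDS]
wait_for_atlas() {
    url=$1
    attempts=${2:-30}
    pause=${3:-10}
    i=1
    while [ "$i" -le "$attempts" ]; do
        code=$(atlas_status "$url")
        echo "attempt $i: HTTP $code"
        if [ "$code" = "200" ]; then
            return 0
        fi
        sleep "$pause"
        i=$((i + 1))
    done
    return 1
}
```

For example: wait_for_atlas http://h1:21000/api/atlas/admin/version. If the status is still 503 after several minutes, waiting longer will not help and the service log is the next place to look.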

Check the atlas service log
The following error messages are found:

org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at Can not find the specified config set: fulltext_index
        at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:627)
        at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:253)
        at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:242)
        at org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest(LBHttpSolrClient.java:4

Repeat the earlier solr create step for the index named in the error message (here, fulltext_index)
Then restart atlas again and access the port; this time it works
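To avoid hitting the same error once per missing index, the collections can be created in one pass. This is a sketch, not part of the original steps: the Atlas documentation lists three collections (vertex_index, edge_index, fulltext_index), of which only the first and last appear in this post, and the run helper and DRY_RUN switch are made up here so the commands can be previewed first.

```shell
#!/bin/sh
# Run from the solr directory used above (apache-atlas-2.0.0/solr/).
# DRY_RUN=1 (the default in this sketch) only prints the commands;
# set DRY_RUN=0 to actually execute them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Create the three index collections named in the Atlas documentation,
# with the same options as the vertex_index command earlier in this post.
create_atlas_indexes() {
    for index in vertex_index edge_index fulltext_index; do
        run bin/solr create -c "$index" -shards 1 -replicationFactor 1 -force \
            || echo "could not create $index (it may already exist)" >&2
    done
}

create_atlas_indexes
```

A collection that already exists makes solr report an error for that one name; the loop then simply moves on to the next.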

In conclusion, installing atlas is quite troublesome, and the software still has rough edges!

4. Hive hook configuration

Modify hive-env.sh

export HIVE_AUX_JARS_PATH=/opt/app/apache-atlas-2.0.0/hook/hive

Modify hive-site.xml
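The hive-site.xml change itself is not reproduced in this post. According to the Atlas Hive hook documentation, the hook is registered by setting hive.exec.post.hooks to org.apache.atlas.hive.hook.HiveHook, and atlas-application.properties is copied into the hive conf directory. Assuming that setup, a small sanity check could look like the sketch below; check_hive_hook is a made-up name.

```shell
#!/bin/sh
# Sketch of a sanity check for the hive hook setup.
# Usage: check_hive_hook HIVE_CONF_DIR HOOK_DIR
check_hive_hook() {
    conf_dir=$1
    hook_dir=$2
    rc=0
    # The directory that HIVE_AUX_JARS_PATH points to must exist.
    if [ ! -d "$hook_dir" ]; then
        echo "missing hook directory: $hook_dir"
        rc=1
    fi
    # hive-site.xml should mention the hook class somewhere.
    if ! grep -qF 'org.apache.atlas.hive.hook.HiveHook' "$conf_dir/hive-site.xml" 2>/dev/null; then
        echo "hive-site.xml does not reference org.apache.atlas.hive.hook.HiveHook"
        rc=1
    fi
    # The hook reads its atlas connection settings from this file.
    if [ ! -f "$conf_dir/atlas-application.properties" ]; then
        echo "atlas-application.properties not found in $conf_dir"
        rc=1
    fi
    return "$rc"
}
```

For example: check_hive_hook "$HIVE_HOME/conf" /opt/app/apache-atlas-2.0.0/hook/hive, where HIVE_HOME is assumed to point at the hive installation.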


5, Run test

Start hive and create a database

hive> create database atlasdemo;
Time taken: 0.267 seconds

Search atlas for the database you just created

The database shows up in the results, so the atlas deployment is working

3, atlas configuration hive hook

Once the hive hook is configured, every operation in hive is captured by the hook, which generates the corresponding atlas metadata and sends it to atlas for storage and management;

However, the hook does not automatically generate metadata for tables that already existed in hive before atlas was installed;
For those, atlas ships a tool that imports metadata from existing hive databases or tables;

Usage 1: <atlas package>/hook-bin/import-hive.sh

Usage 2: <atlas package>/hook-bin/import-hive.sh [-d <database regex> OR --database <database regex>] [-t <table regex> OR --table <table regex>]

Usage 3: <atlas package>/hook-bin/import-hive.sh [-f <filename>]
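The three usages above might be invoked as in the sketch below. The ATLAS_HOME path is the install location used earlier in this post, while the table pattern and the file name are hypothetical. As far as I can tell from the Atlas documentation, the file passed to -f holds one database:table entry per line. The run helper and DRY_RUN switch are made up here so the commands can be previewed.

```shell
#!/bin/sh
# Install location from the installation steps in this post.
ATLAS_HOME="${ATLAS_HOME:-/opt/app/apache-atlas-2.0.0}"
# DRY_RUN=1 (the default in this sketch) only prints the commands.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

import_hive() {
    run "$ATLAS_HOME/hook-bin/import-hive.sh" "$@"
}

# Usage 1: import every database and table.
import_hive
# Usage 2: import one database, optionally narrowed by a table regex
# ('order.*' is a hypothetical pattern).
import_hive -d atlasdemo
import_hive -d atlasdemo -t 'order.*'
# Usage 3: import the entries listed in a file (hypothetical path).
import_hive -f /tmp/hive_tables.txt
```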

4, Introduction to atlas

Apache Atlas UI features in detail
The Apache Atlas UI has three parts: SEARCH, CLASSIFICATION and GLOSSARY

The Search module includes Basic Search, Advanced Search and entity creation.

1, Basic Search

The query criteria are Type, Classification, Term, and Text. Frequently used combinations of criteria can be saved.
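The same kind of query can be sent to the atlas REST API, which is handy for scripted checks. The sketch below builds a basic-search URL for the v2 endpoint; the host h1 and port 21000 come from this post, the admin:admin login is an assumption, and the function names are made up. The values are not URL-encoded, so this only suits simple names.

```shell
#!/bin/sh
ATLAS_URL="${ATLAS_URL:-http://h1:21000}"
# Assumption: default file-based login.
ATLAS_AUTH="${ATLAS_AUTH:-admin:admin}"

# Build the basic-search URL for a type name and a free-text query.
# Usage: basic_search_url TYPE TEXT
basic_search_url() {
    printf '%s/api/atlas/v2/search/basic?typeName=%s&query=%s\n' "$ATLAS_URL" "$1" "$2"
}

# Send the request; atlas answers with JSON.
basic_search() {
    curl -s -u "$ATLAS_AUTH" "$(basic_search_url "$1" "$2")"
}

# Print the URL that would look up the database created in the run test.
basic_search_url hive_db atlasdemo
```

Calling basic_search hive_db atlasdemo sends the request; hive_db is the type name the hive hook uses for databases.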

2, Advanced Search

The query conditions are Type and Query. Frequently used combinations of criteria can also be saved.
Commonly used Type values: Asset, avro_collection, avro_enum, avro_field, avro_fixed, avro_primitive, avro_record, avro_schema, avro_type
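Advanced search accepts a query in the atlas DSL, and the REST API exposes the same thing at /api/atlas/v2/search/dsl. A sketch follows: the query string is only an example against the atlasdemo database from the run test, the admin:admin login is an assumption, and the run helper and DRY_RUN switch are made up here so the command can be previewed.

```shell
#!/bin/sh
ATLAS_URL="${ATLAS_URL:-http://h1:21000}"
# Assumption: default file-based login.
ATLAS_AUTH="${ATLAS_AUTH:-admin:admin}"
# DRY_RUN=1 (the default in this sketch) only prints the command.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Send a DSL query; -G with --data-urlencode lets curl encode spaces and quotes.
# Usage: dsl_search 'DSL QUERY'
dsl_search() {
    run curl -s -G -u "$ATLAS_AUTH" \
        --data-urlencode "query=$1" \
        "$ATLAS_URL/api/atlas/v2/search/dsl"
}

dsl_search 'hive_db where name = "atlasdemo"'
```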

3. Create Entity


The Classification module includes the classification list (flat or tree structure) and classification creation.
Flat structure, as shown in the figure below:

Tree structure, as shown in the figure below:

Create a Classification, as shown in the following figure:

In the Atlas web UI, custom classification tags can be added, for example a tag for fact tables, a dim tag for dimension tables and an agg tag for aggregation tables.

Then add the corresponding tags to each hive table. The result for the dimension table tag is as follows.
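Classification tags like these can also be created through the REST API instead of the UI, by posting classification definitions to /api/atlas/v2/types/typedefs. The sketch below only builds the JSON payload; the tag names fact, dim and agg are hypothetical stand-ins for the tags described above, and the function name is made up.

```shell
#!/bin/sh
# Build a typedefs payload with one classification definition per argument.
# Names are inserted as-is, so they must not contain quotes or backslashes.
# Usage: classification_defs_json NAME...
classification_defs_json() {
    printf '{"classificationDefs":['
    sep=''
    for name in "$@"; do
        printf '%s{"name":"%s","description":"","superTypes":[],"attributeDefs":[]}' "$sep" "$name"
        sep=','
    done
    printf ']}\n'
}

# Print the payload for the three example tags.
classification_defs_json fact dim agg
```

The payload could then be sent with something like curl -s -u admin:admin -H 'Content-Type: application/json' -X POST -d "$(classification_defs_json fact dim agg)" http://h1:21000/api/atlas/v2/types/typedefs, where admin:admin is again an assumed default login.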


The Glossary module includes the glossary list query (by Term or Category) and glossary creation.
Query the Glossary list, as shown in the following figure:

Create a Glossary, as shown in the following figure:

6. Lineage view

After searching for a table in the Atlas web UI, you can open it and view its Lineage, for example for the agg · monthbrandsalesamount table.
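The lineage graph is also available from the REST API once the GUID of the table entity is known (it appears in search results and in the entity page URL). A sketch follows: the function name is made up and the all-zero GUID is only a placeholder.

```shell
#!/bin/sh
ATLAS_URL="${ATLAS_URL:-http://h1:21000}"

# Build the lineage URL for an entity GUID.
# DIRECTION is INPUT, OUTPUT or BOTH; DEPTH is the number of hops to follow.
# Usage: lineage_url GUID [DIRECTION] [DEPTH]
lineage_url() {
    printf '%s/api/atlas/v2/lineage/%s?direction=%s&depth=%s\n' \
        "$ATLAS_URL" "$1" "${2:-BOTH}" "${3:-3}"
}

# Placeholder GUID; replace it with the GUID of a real table entity.
lineage_url 00000000-0000-0000-0000-000000000000
```

Fetching the printed URL with curl -s -u admin:admin (an assumed default login) returns the lineage as JSON. A table only has lineage if some hive operation derived it from, or fed it into, another table.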


Tags: solr hive Apache HBase

Posted on Tue, 11 Feb 2020 03:02:02 -0500 by bensonang