Amundsen

A data discovery and metadata engine for improving productivity when interacting with data

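Amundsen is a data discovery and metadata engine for improving the productivity of data analysts, data scientists, and engineers when they interact with data. It does this by indexing data resources (tables, dashboards, streams, etc.) and providing a page-rank style search based on usage patterns (for example, heavily queried tables are shown before rarely queried ones). You can think of it as Google search for data. The project is named after Roald Amundsen, the Norwegian explorer who was the first person to reach the South Pole.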
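As a rough illustration of usage-based ordering, the sketch below sorts a handful of table names by query count, highest first. The table names and counts are made-up sample data, and `rank_by_usage` is a hypothetical helper; this is not how Amundsen's search is implemented, only a toy version of the idea that heavily queried tables come first.

```shell
# Hypothetical usage data in the form "<table name> <query count>".
# These values are invented for illustration; they are not Amundsen output.
usage_data='test_table1 120
test_table2 15
test_table3 480'

# Sort numerically on the second column (query count), highest first,
# then print only the table names.
rank_by_usage() {
  sort -k2,2nr | awk '{print $1}'
}

printf '%s\n' "$usage_data" | rank_by_usage
```

The most queried table is listed first and the least queried one last.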

Installation

Bootstrap a default version of Amundsen using Docker

The following instructions are for setting up a version of Amundsen using Docker.

  1. Make sure you have at least 3GB of disk space available to Docker. Install docker and docker-compose.

  2. Clone this repo and its submodules by running:

    $ git clone --recursive https://github.com/amundsen-io/amundsen.git
  3. Enter the cloned directory and run the command below:

    # For Neo4j Backend
    $ docker-compose -f docker-amundsen.yml up
    
    # For Atlas
    $ docker-compose -f docker-amundsen-atlas.yml up

    If it’s your first time, you may want to go through the troubleshooting steps proactively, especially the first one, which covers heap memory for Elasticsearch and Docker engine memory allocation (too little memory leads to Docker error 137).

  4. Ingest the provided sample data into Neo4j as follows (skip this step if you are using the Atlas backend):

  5. In a separate terminal window, change directory to databuilder.

  6. The sample_data_loader Python script in the example/ directory uses the Elasticsearch client, pyhocon, and other libraries. Install the dependencies in a virtual environment and run the script with the commands below. See Windows Troubleshooting if python3 setup.py install fails with an error about extras_require on Windows.

     $ python3 -m venv venv
     $ source venv/bin/activate
     $ pip3 install --upgrade pip
     $ pip3 install -r requirements.txt
     $ python3 setup.py install
     $ python3 example/scripts/sample_data_loader.py
  7. View the UI at http://localhost:5000 and try searching for test; it should return some results.

  8. You can also perform an exact-match search on the table entity. For example, search for test_table1 in the table field and it will return the matching records.
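Because the containers can take a while to start, it may help to poll the UI address from step 7 before opening it in a browser. The sketch below is one possible way to do that with curl; `wait_for_url` is a hypothetical helper, and the default attempt count and delay are arbitrary assumptions rather than values recommended by Amundsen.

```shell
# Poll a URL until it responds successfully or the attempts run out.
# Usage: wait_for_url URL [ATTEMPTS] [DELAY_SECONDS]
# The defaults (30 attempts, 5 seconds apart) are arbitrary assumptions.
wait_for_url() {
  url=$1
  attempts=${2:-30}
  delay=${3:-5}
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -s: silent, -f: treat HTTP errors as failures, -o /dev/null: discard the body
    if curl -sf -o /dev/null "$url"; then
      echo "Reachable: $url"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "Gave up waiting for $url" >&2
  return 1
}
```

Once docker-compose is running, calling `wait_for_url http://localhost:5000` returns as soon as the frontend answers, or fails after the last attempt.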

Atlas Note: Atlas takes some time to boot properly, so you may not see results immediately after running the docker-compose up command. Atlas is ready once the following line appears in the Docker output: Amundsen Entity Definitions Created...
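One way to check for that readiness line without watching the output by hand is to search the container logs for it. The sketch below assumes the message text matches the Atlas note exactly; `atlas_ready` is a hypothetical helper that reads log lines from standard input.

```shell
# Read log lines from stdin and succeed if the readiness message appears.
# The message text is taken from the Atlas note; adjust it if your output differs.
atlas_ready() {
  grep -q "Amundsen Entity Definitions Created"
}
```

For example, `docker-compose -f docker-amundsen-atlas.yml logs | atlas_ready && echo "Atlas is ready"` prints the confirmation only if the message is already present in the logs.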

References

Installation example blog post 👍