What is Apache NiFi?
Apache NiFi is an open-source project that automates the flow of data between systems, a task sometimes called "data logistics". It is built on the concepts of flow-based programming and provides a web-based user interface for managing data flows in real time.
The project was created by the United States National Security Agency (NSA) under the name NiagaraFiles. In 2014 the NSA released it as open-source software. Development then continued at Onyara, Inc., which was subsequently acquired by Hortonworks.
What Apache NiFi Does
Apache NiFi is an integrated data logistics platform for automating the movement of data between disparate systems. It is data source agnostic and supports sources with different formats, schemas, protocols, speeds, and sizes. Common sources include geolocation devices, click streams, files, social feeds, and log files. NiFi provides a configurable plumbing platform for moving data and allows that data to be traced in real time. It is not an interactive ETL tool, although it can be part of an ETL solution.
Apache NiFi is designed from the ground up to be enterprise ready: flexible, extensible, and suitable for a range of devices from network edge devices such as a Raspberry Pi to enterprise data clusters and the cloud. Apache NiFi can also adjust to fluctuating network connectivity that could impact the delivery of data.
Apache NiFi Features
NiFi supports directed graphs of data routing, transformation, and system mediation. Features include:
- Web-based user interface - covering design, control, feedback, and monitoring.
- Highly Configurable - lets users trade off loss tolerance against guaranteed delivery, and low latency against high throughput. Supports dynamic prioritization of flows, modification of flows at runtime, and back pressure thresholds, which specify how much data may sit in a queue so that the system is not overrun with data.
- Data Provenance - enables tracking data flows from beginning to end.
- Extensible - enables users to build their own processors and more. Enables rapid development and effective testing.
- Secure - supports SSL, SSH, HTTPS, encrypted content, and more. Provides multi-tenant authorization and internal policy management.
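The back pressure thresholds mentioned in the list above can be illustrated with a small toy model. The sketch below is not NiFi code and does not use any NiFi API: the Connection class, its max_objects and max_bytes parameters, and the offer and poll methods are all hypothetical names chosen for illustration. It only shows the general idea of a queue that stops accepting data once either an object-count limit or a data-size limit is reached. NiFi itself applies back pressure by pausing the upstream component rather than returning a refusal, so the return value here is a simplification.

```python
from collections import deque


class Connection:
    """Toy queue between two processing steps with back pressure thresholds.

    Hypothetical illustration only; this is not part of any NiFi API.
    """

    def __init__(self, max_objects, max_bytes):
        self.max_objects = max_objects  # limit on number of queued items
        self.max_bytes = max_bytes      # limit on total queued payload size
        self.queue = deque()
        self.queued_bytes = 0

    def back_pressure_engaged(self):
        # Back pressure applies once either threshold has been reached.
        return (len(self.queue) >= self.max_objects
                or self.queued_bytes >= self.max_bytes)

    def offer(self, payload):
        """Enqueue payload unless back pressure is engaged; return success."""
        if self.back_pressure_engaged():
            return False
        self.queue.append(payload)
        self.queued_bytes += len(payload)
        return True

    def poll(self):
        """Dequeue the oldest payload, or return None if the queue is empty."""
        if not self.queue:
            return None
        payload = self.queue.popleft()
        self.queued_bytes -= len(payload)
        return payload


# A producer tries to push five records into a queue limited to three objects.
conn = Connection(max_objects=3, max_bytes=1024)
accepted = [conn.offer(b"record-%d" % i) for i in range(5)]
print(accepted)

# Once the consumer drains one record, the producer can make progress again.
conn.poll()
print(conn.offer(b"record-5"))
```

In this model the producer simply sees a refusal and has to retry later. The point of the thresholds is the same as described in the list: a slow consumer bounds how much data can pile up in front of it, instead of letting the queue grow without limit.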