Continue reading “Install Node.js, npm, and Angular on CentOS 7.x”
What is Apache Spark
You may have noticed that wherever big data is discussed, the name Apache Spark eventually comes up. In the simplest terms, it is a large-scale data processing engine: a fast framework that provides APIs for connecting to data sources and performing big data processing. As one of the largest open-source data processing projects, Spark has been adopted by large companies: Yahoo, eBay, and Netflix run massive-scale Spark deployments, processing multiple petabytes of data on clusters of over 8,000 nodes.
Apache Spark can run as a standalone cluster (which is what we’ll do in this tutorial), or with Mesos or YARN as the cluster manager. Spark can work with data from various sources: AWS S3, HDFS, Cassandra, Hive (structured data), HBase, or any other Hadoop data source. Above all, what makes Spark so in demand is its bundled libraries, MLlib, Spark SQL and DataFrames, GraphX, and Spark Streaming, which cover the main data processing use cases and can all be combined in the same application.
Continue reading “Apache Spark and PySpark on CentOS/RHEL 7.x”
Jupyter notebooks are a nice way to keep your code, diagrams, and documentation together, mostly in a single file that is also executable, i.e. it can run/interpret the code in it and save the results alongside it. Here’s a blog post on installing Jupyter Notebook; today I’ll share how to use the Ruby kernel with Jupyter Notebook, i.e. executing Ruby code inside notebooks.
To create notebooks that can execute Ruby code, we need to integrate the Ruby kernel. The 3 simple steps are:
- Install Jupyter
- Install Ruby
- Install iruby
Continue reading “Ruby Kernel for Jupyter Notebook”
Ruby is a dynamic, open source programming language with a focus on simplicity and productivity. Yukihiro “Matz” Matsumoto created it in the mid-1990s, drawing on other programming languages, namely Perl, Ada, Lisp, Eiffel, and Smalltalk. Ruby was released in 1995. Like Python (released a few years earlier), Ruby has dynamic typing and automatic memory management.
Continue reading “Install latest Ruby version using rbenv”
What is Jupyter Notebook
If you’re a Python developer, or someone who has to interact with Python, you have probably come across the term Jupyter Notebook quite a lot while reading articles or looking for solutions online.
The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and explanatory text. Uses include: data cleaning and transformation, numerical simulation, statistical modeling, machine learning and much more.
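To make the idea of a document that mixes live code and explanatory text more concrete, here is a minimal sketch of what a notebook looks like on disk. It assumes the version 4 notebook format, where an .ipynb file is plain JSON holding a list of cells; the cell contents and the helper name cell_types are made up for illustration.

```python
import json

# A hand-built notebook, assuming the nbformat 4 layout:
# one markdown cell for explanatory text and one code cell.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# Demo\n", "Explanatory text sits next to the code."],
        },
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": None,  # not run yet, so no counter
            "outputs": [],            # results are stored here after a run
            "source": ["print(1 + 1)"],
        },
    ],
}


def cell_types(nb):
    """Return the type of each cell, in document order (illustrative helper)."""
    return [cell["cell_type"] for cell in nb["cells"]]


# The JSON text is what would be written to a file ending in .ipynb.
serialized = json.dumps(notebook, indent=1)
print(cell_types(notebook))
```

Because the outputs list lives in the same file as the source, a saved notebook can carry its results along with the code that produced them.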
Continue reading “Install Jupyter Notebook”