Apache Spark on z/OS

A rough calculation suggests that Spark on z/OS pays off once you would otherwise be transferring more than 150 GB of data from z/OS to a data warehouse on another platform.
10.02.2017.

When my colleagues and I, as members of the mainframe community at CROZ, heard about Apache Spark on the mainframe (for more about Spark, read the recent excellent blog post by our colleague Benšić), we assumed it was meant for Linux on z Systems. We already knew Apache Spark, and it makes sense that it runs on Linux. So if it runs on Linux, why wouldn’t it run on Linux on z? It seemed to be one of those new open source technologies (like Docker, MongoDB, Node.js, …) that are also being brought to Linux on z. But when we realized it was Apache Spark for z/OS, things suddenly got interesting. Spark on z/OS? Why, how? Well, here it is.

Why?

Because it is becoming more and more interesting to run analytics on live transactional data, and most of that data in the world resides on the mainframe, especially on z/OS, where it is managed by DB2 and IMS/DB. Today these are also called ‘systems of record’. So you run your analytics in the very place where the data is created and modified, without having to extract, transform and load it into large data warehouses, which often sit outside the mainframe. To mention just some of the advantages: analytics results in real time, the security and resiliency of the platform, and data colocation. It is also important to note that Spark on z/OS works with data from various sources such as DB2, IMS, VSAM, ADABAS, … and it can even work with data residing outside the mainframe!
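As a rough illustration of the ‘various sources’ point, here is a minimal PySpark sketch of reading a DB2 table through Spark's generic JDBC data source. The host, port, location name, table and credentials are hypothetical placeholders, and the sketch assumes the IBM DB2 JDBC driver jar is on the Spark classpath. The data access layer actually used on z/OS may differ, so treat this as a sketch rather than the recommended setup.

```python
def db2_jdbc_options(host, port, location, table, user, password):
    """Build the options for Spark's generic JDBC data source.

    All values are supplied by the caller; nothing here is specific
    to a real installation.
    """
    return {
        "url": f"jdbc:db2://{host}:{port}/{location}",
        # IBM DB2 JDBC driver class (assumption: its jar is on the classpath).
        "driver": "com.ibm.db2.jcc.DB2Driver",
        "dbtable": table,
        "user": user,
        "password": password,
    }


def load_table(spark, options):
    """Load a table as a DataFrame; `spark` is an existing SparkSession."""
    return spark.read.format("jdbc").options(**options).load()


# Hypothetical connection values, for illustration only.
opts = db2_jdbc_options(
    "zos.example.com", 446, "DB2LOC", "SALES.ORDERS", "user", "secret"
)
```

With a running Spark session, `load_table(spark, opts)` would return a DataFrame that can be queried like any other Spark data source.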

How?

Well, a more precise question would be ‘how, without it being too expensive?’. The answer lies in the fact that most of the Spark workload (>90%) is executed on zIIP processors, whose utilization does not affect MLC. In addition to processor power you will also need some memory. Spark on z/OS benefits from many z/OS-specific features such as WLM, large page support and SMT2. It integrates with RACF or another security product, and it can be monitored with RMF or the Spark Web UI.
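To make the zIIP argument a little more concrete, the following small sketch splits a workload's CPU time between zIIP and general-purpose processors. The 0.9 default is simply the lower bound of the ‘>90%’ figure above, and the 1000 CPU-seconds input is an invented number used only for illustration.

```python
def split_cpu(total_cpu_seconds, ziip_fraction=0.9):
    """Split CPU time into a zIIP-eligible part and a general-purpose part.

    ziip_fraction is an assumption taken from the '>90%' figure in the text.
    """
    if not 0.0 <= ziip_fraction <= 1.0:
        raise ValueError("ziip_fraction must be between 0 and 1")
    ziip = total_cpu_seconds * ziip_fraction
    general = total_cpu_seconds - ziip
    return ziip, general


# Hypothetical workload of 1000 CPU-seconds.
ziip_seconds, general_seconds = split_cpu(1000.0)
```

Under these assumptions only about a tenth of the CPU time lands on general-purpose processors, which is the part that matters for MLC.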

Apache Spark for z/OS is free, but if you want support from IBM you have to pay for it. If you need to buy additional zIIP processors and memory for Spark on z/OS, IBM has announced some very attractive discounts. Given all that, a rough calculation suggests that Spark on z/OS pays off once you would otherwise be transferring more than 150 GB of data from z/OS to a data warehouse on another platform.

If you recognize yourself in the story above and want to find out whether Spark on z/OS is the right thing for you, CROZ experts, with their Spark knowledge and extensive mainframe experience, are here to help.
