My name’s Eldee Stephens. I am the Nest Bring
Up Lead and part of the processor design group for IBM z Systems. And today I want to talk
a little bit about the newest member of the IBM z13 family that we just introduced a couple
weeks ago, the IBM z13s, and walk you through processors and I/O and crypto and operating
systems updates and all that fun stuff. So if you’ve all been drowning in a treacle of
marketing data for the past three days, we’re going to get really techy here. But before I start – I don’t know how many of you all are mainframers or not – I wanted to talk a little bit about the reasons why, for the past fifty years, the
IBM mainframe has been the cornerstone of the IT infrastructure for the world’s largest
corporations and institutions. As an engineer, it’s amazing to meet customers in all these
different fields – energy, transportation, logistics, banking, finance, railways in China,
steel foundries in Poland, big retail, you name it – z has been at the center of their
IT infrastructure for decades. IBM internal studies suggest that upwards of 80% of the
world’s corporate data is either housed or processed on IBM mainframes to this day. And I think the reason why the mainframe has
remained as relevant as it has is for a few reasons. The first one is it is by far the
most secure computing platform in the world. It’s the only one that’s EAL5+ compliant.
We have security everywhere from the Silicon level all the way up through the software
stack. There’s cryptography everywhere, we have tremendous mechanisms for authentication
of users, very finely grained auditing. Our crypto stuff has the best certifications in
the world. They always support the newest cryptography standards. Another reason is these systems are by far
the most reliable commercial computers in the world. Everything is redundant – everything.
We really follow an N-plus-1 design philosophy, whether you’re talking about power supplies, fans, coolant hoses, optical fiber, the interconnect between the I/O cages, array cells in the processor caches, or memory cards; we always have the ability to detect errors and correct them, if possible. If we can’t correct them, we can recover. And God forbid there’s
something completely catastrophic. We always maintain our client’s data integrity. And thirdly is just the massive scale and
capacity of the mainframe whether you’re talking about a compute capacity or storage capacity,
I/O connectivity, I/O bandwidth. These systems have the ability to run massive amounts of
workloads for massive numbers of users, driving utilization levels up to 100% for extended
periods, all with tremendous virtualization capabilities, as well. I think the fourth reason is because every
single time we introduce a new generation, we, the engineers, always try to reinvent
the platform a little bit to make it relevant. And we cherry-pick from the greatest innovations
from our research laboratories in computer science, mathematics, and physics to try to
make z harness upcoming IT trends so that our clients can drive business growth and
drive customer satisfaction. For the z13 family, we introduced the high-end
z13 about a year ago at this time and the z13s – that’s what we’ll talk about today – just a few days ago. There were really four main goals. The first one was to make certain that
we could handle the explosion of transaction volume because of the pervasive use of these
guys – mobile devices and the apps that come along with them – as well as a tremendous amount of sensor data from the members of the Internet of Things. And this is particularly
important to our industrial customers. The second area that we really focused on
was adding new compute technologies to the processors. To bring in the ability to run
more technical workloads into our system, really focusing on things like analytics in
transaction, predictive analytics. And we’ll talk a little bit about what that means later
on. The third area was security. Just being the
leader in computing security is not good enough. We see every single day that the cyber threat
environment is changing. I can’t tell you how many letters I’ve gotten from people who
tell me that, “My identity’s been stolen and that’s the reason why there’s a charge
from some hardware store on my credit card that I didn’t ask for.” And then fourth, we had a really big push
to integrate a lot of open source technologies and open up our software through open source
APIs so that we can bring in developers who might not be as familiar with the mainframe
to be able to interact with enterprise-level back ends, middleware, and software. The z13s basically takes all of those ingredients that we started with on the z13, plus a few more, and introduces them into a smaller form factor at a lower cost of entry. Compared with its predecessor, it has
more cores – up to twenty available for the user. That doesn’t include system assist
processors and so forth. It has much more cache, right? More than 2 gigabytes of cache – more cache than a lot of people’s PCs have memory. It has new capabilities in the processor including vector processing
and simultaneous multithreading – much, much, much more memory. We’re going to talk
about the benefits of big memory in a little bit – upwards of 4 terabytes. That’s eight
times what we had on the zBC12. We’ve got a lot more I/O options and a lot
better I/O performance with PCIe Generation 3 technology. We really focused on LINUX.
Big improvement in LINUX capacity with this machine. We’re bringing the world’s best disaster
recovery solutions – GDPS – to LINUX. We’re introducing Dynamic Partition Manager
to bring z/VM-like capabilities to LINUX-only deployments. And we’ve got a lot of new security
technologies, as well. When you add up all those ingredients, you
get 40% more capacity from a z/OS perspective. You’ve got more than twice the total LINUX
on z capacity, 50% improvement for Java workloads with SIMD and SMT and actually, that gets
a little bit better. If your Java workloads scale nicely with multiple threads, the performance
is anywhere between 60% and 70%. The cryptographic coprocessor on the processor is a brand-new design
and you can see anywhere between two and three and a half times more throughput depending
on the encryption standard that you’re using. More I/O bandwidth, more channel speed, more
memory, more cache. And what this means is that you can do more, run more at lower cost
and introduce new workloads that you never would’ve imagined doing on the mainframe before.
And that’s really, I think, the key here that we’re wanting to focus on. We want to end
this division between the concept of having systems of record and systems of engagement,
right? Where you have all your data on the mainframe but you’ve got your customer-facing
applications running somewhere else. We want to bring all that onto the mainframe and I’m
going to show you some of the things that we’re doing to make that happen. And when it comes to the concept of performance,
you know, what we’re trying to do here is we don’t want to just grow compute density, right, by offering more cores per die. We don’t want to just increase the amount of
I/O connectivity. We don’t want to just increase the amount of storage, we don’t want to just
increase the I/O bandwidth; although I’d really like that number, 315 gigabytes a second in
a single frame. That’s enough I/O bandwidth to make an Intel guy weep. So what we try to do is we’ve tried to look
at the total performance picture and everywhere we could, we wanted to turn each dial up to eleven and make Spinal Tap proud. Alright. I’d like to show you actually what
this thing looks like when you take the cover off. And by the way, that cover is not just
pretty. It actually has a lot of sound-deadening material in it, as well as helping with airflow.
We can talk about that later. But we offer the z13s in two main models – the N10 and
the N20. This is based on the number of user available cores. The N20 is a little interesting
because you can have it where you only have a single drawer, you can have two drawers.
You can have two drawers where all the cores are on one and the second one is primarily
for I/O connectivity. So there’s a lot of flexibility. And of course, all those cores can be characterized
as central processing engines for z/OS, IFLs for LINUX, coupling facilities, zIIPs – zAAPs don’t exist anymore, so if you’ve got a qualified workload like XML parsing for DB2 or Java or so forth, you can run it on a zIIP and you don’t get that software charge. If you have no idea what I mean by zIIPs and zAAPs, don’t worry about it, we’ll talk about it later. One of the great things about this is just
the granularity of capacity that’s available to the customer. Capacity on demand has always
been a big mainframe thing, right? You can go ahead and purchase a configuration that has more hardware in the box than you get charged for. And as your business starts
to grow and you need more capacity, you need more processors, you need more storage, you
need more I/O; you just go to a web page, fill out a form, press a button, and it’s
there. And we can also do it based on workload spikes, right? So maybe you’ve got a Cyber Monday-type situation and you know over the next couple of days that your demands are
going to spike up. There’s no need to actually buy all that capacity for the long term if
you don’t need it. If you want to do it temporarily, we can do that and we can do it on a very
fine granularity. Like I said, upwards of 4 terabytes of memory
– RAIM memory, most reliable in the world. I happen to be the logic designer behind that.
And the I/O subsystem here has much better performance and much better bandwidth. We’ve
got dual PCIe generation 3 I/O controllers on the processor dies. This is an air-cooled
system, by the way, single frame like other business class systems before. If we start
from the top on down, you’ve got your support elements. These are now actually 1U-high rack servers – redundant, of course, right? We don’t want a situation where something
fails. Seamless transition in the case of one of those things failing. And the support
element is where you’re going to define all of your partitions, it’s where you’re going
to set up your I/O configuration, IPL operating systems, interact with your cryptography subsystem,
etc. We also offer as an option the integrated
backup facility. These are really big honking batteries like big UPSs for your server. Not
everybody has a data center where you’ve got a fuel-oil generator in the back. So you
want to have the ability in the event of a catastrophic power loss to have the time to
gracefully shut down your workloads. Or alternatively, if your z13 just happens to be part of a geographically
dispersed parallel Sysplex environment, you can migrate those workloads over to the remote
site. Bulk power supplies. Underneath that, there’s
a power coupling for 480-volt, three-phase power in the front and the back. Underneath
that, you’ve got your system control hub and then the meat of the machine, your central
processor complex. This is where you’ve got your processors, your memory, your system
controllers, all your fan-outs, your I/O cages. Speaking of I/O cages, you know, upwards of
two PCIe I/O drawers. Each I/O drawer can contain thirty-two PCIe generation 3 cards.
And then at the bottom, we have a Legacy I/O drawer for all of your older cards like for
coupling and so forth you may still have around. They’re InfiniBand based. Going around the back, I just wanted to point
out to you here that we’ve got, as always, multiple DCAs for your processor drawers in
case of failure, multiple fans so that if a fan fails, you still have enough cooling.
And these oscillator cards here, dual again, N plus 1 design for reliability. And we’re
constantly monitoring the output of those things. That’s your reference clock, right?
Those clocks are feeding all the processors and electronics throughout the system. If
that thing goes wonky, you’ve got real problems, right? So we’re constantly looking at long
term clock jitter, clock skewing, things like that. Anytime something gets out of whack,
instant failover. Okay. The processor drawers themselves are
actually split up, logically, into two different nodes and I’ll show you how the fabric all
looks a little bit later. But each node – and there’s two per drawer – has two processor
chips and you can have upwards of six or seven cores in each one of those processor chips.
Your system controller, which does all your fabric communications and has your L4 cache.
Upwards of ten DIMMs, we’ll talk about the memory options here in a little bit. And in
the front, this is your fan-out here. So your service processors; redundant, of course, and enough fan-out cards to your PCIe cages so that you can have redundant interconnect for reliability. In the center is the fan-out to your InfiniBand Legacy cards. Or alternatively,
you can plug in a new short range coupling card solution that we’ll talk about here in
a little bit. Alright, I’d like to talk through some of
the processor elements now. This is the exact same processor that came with the z13. It’s
an eight-core design, 22 nanometer technology, seventeen layers of metal, nearly four billion
transistors, 13.7 miles of copper wire. You can have upwards of six to seven active cores
per processor chip on a z13s running at 4.3 gigahertz. Each core has L1 and L2 private
caches split for both instruction and data. And then, they all share a nice big 64 megabyte
L3 cache that uses our embedded DRAM technology. And by the way, for reliability, we’ve got
lots of redundancy on those caches – scrubbing engines constantly looking for errors and
so forth. The big introduction with this – and we’ll
talk about those in a second – are single instruction multiple data for vector processing
and the introduction of simultaneous multithreading. Now, that little acronym, those three letters,
had a huge impact on the design. It’s one of the reasons why this core is almost a ground-up
redesign. And we’ll talk through that in a minute. We have native PCIe Generation 3 I/O controllers,
two of them, plus your Legacy GX controller and your memory controllers that talk to the
RAIM memory subsystem. Okay. Remember earlier that I was telling
you about the fact that what we really want to do is we want to increase compute capacity
here and we want to make z a better platform for technical workloads, as well as the traditional
mixed commercial workloads? Well, there’s sort of three layers of that this time around.
The first one is simultaneous multithreading. And it’s exactly what it sounds like – the
ability to run multiple threads within a given core, sharing all the core’s resources. This
is a lot harder than it sounds, right? Each one of these threads has to have separate
registers, state data, etc. And being able to dispatch and control and execute multiple
threads in the core, sharing all those resources, that’s not an easy thing to do while maintaining
increases in single thread performance. We’ve got a lot more parallelism in the core,
a lot more. We can actually complete six instructions per clock, issue up to ten instructions per
clock. We’ve got four fixed point execution units, four floating point execution units.
And then the big one here, particularly for analytics, is vector processing – single
instruction multiple data. We’ll talk about that here in a little bit. There’s a lot of
additions to the ISA for z that’s associated with that and a lot of really cool hardware
on the processor to make this happen. So simultaneous multithreading is very, very,
very helpful but it’s only really helpful for certain types of workloads. There’s a
lot of workloads out there that don’t really scale well with multiple threads. So you want
to have the granularity, depending on what kind of workloads you’re running, to either
run in a high-speed single-threaded mode or, for throughput purposes, for things like certain
Java workloads that do scale well on multiple threads, you can enable SMT. And for those
types of workloads, the performance improvement compared with the previous system is substantial
– between 60% and 70%. We’ve got SMT integrated throughout the software stack and it’s supported
by all of our operating systems – VM, LINUX, z/OS. PR/SM understands it at the LPAR level – completely transparent, by the way, to your applications so you, developers, don’t
have to think about this, right? It’s all sort of baked in. And like I said, it’s completely
supported throughout the software stack. The application layer, the middleware layer
and the tool chains have been updated tremendously. And over time, you’re going to see more and
more workloads take advantage of SMT and you’re going to see more and more tuning of our products to take advantage of SMT very heavily. SMT is a big, big change for us architecturally and you’re going to see us use this to be able to, in future generations, scale out more and more and more and have more compute density.
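As a rough sketch of the kind of workload that does scale with threads – everything here is made up purely for illustration – think of a job that chops independent work into pieces and hands each piece to its own thread:

```c
/* Illustrative only: independent work divided across threads, the kind of
 * job that tends to benefit when SMT is enabled. Names and sizes are made up.
 * Build with: cc -O2 -pthread example.c */
#include <pthread.h>
#include <stdio.h>

#define NTHREADS 4
#define N 1000000

static double data[N];
static double partial[NTHREADS];

static void *worker(void *arg) {
    long id = (long)arg;
    long chunk = N / NTHREADS;
    long start = id * chunk;
    long end = (id == NTHREADS - 1) ? N : start + chunk;
    double sum = 0.0;
    for (long i = start; i < end; i++)
        sum += data[i] * data[i];      /* independent work per thread */
    partial[id] = sum;
    return NULL;
}

int main(void) {
    pthread_t tid[NTHREADS];
    for (long i = 0; i < N; i++)
        data[i] = (double)i;
    for (long t = 0; t < NTHREADS; t++)
        pthread_create(&tid[t], NULL, worker, (void *)t);
    double total = 0.0;
    for (long t = 0; t < NTHREADS; t++) {
        pthread_join(tid[t], NULL);
        total += partial[t];
    }
    printf("sum of squares = %f\n", total);
    return 0;
}
```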
Single instruction multiple data is kind of a fancy way of saying that instead of taking little bits of data and doing permutations
on them bit by bit, we do one instruction for large sets of data. Might have a spreadsheet,
right? You’ve got two columns, A and B, and you want to add them together and store them
in Column C. Well, what your computer currently does, if it doesn’t have SIMD capabilities,
it says, “Okay. A plus – A1 plus B1 equals C1. A2 plus B2 equals C2 and so on and so
forth.” That takes a long time. The beauty with SIMD is we just take that whole column, add it to the other whole column, and store the result in a vector somewhere else. So you get much greater throughput for technical workloads.
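As a rough sketch of that column example – the array size and the compiler flags in the comment are just illustrative assumptions – a plain loop like this is exactly the shape of code a SIMD-aware compiler can turn into vector instructions:

```c
/* Illustrative sketch of "column C = column A + column B".
 * Written as a plain scalar loop; a SIMD-aware compiler (for instance GCC on
 * Linux on z with something like -O3 -march=z13) can turn this loop into
 * vector instructions that add several elements per instruction. */
#include <stdio.h>

#define ROWS 1024

static double colA[ROWS], colB[ROWS], colC[ROWS];

int main(void) {
    for (int i = 0; i < ROWS; i++) {   /* fill the two input "columns" */
        colA[i] = i * 1.5;
        colB[i] = i * 0.5;
    }
    /* Without SIMD this is one add per element: C1 = A1 + B1, C2 = A2 + B2, ...
     * With SIMD the hardware adds a whole vector of elements at a time. */
    for (int i = 0; i < ROWS; i++)
        colC[i] = colA[i] + colB[i];
    printf("colC[10] = %f\n", colC[10]);
    return 0;
}
```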
We’ve optimized the software end of this, as well. The tool chains, all the compilers for COBOL, PL/I, Java, and C/C++ now support SIMD.
We have math libraries available for LINUX and z/OS that you can plug into very easily
to take advantage of the scale that this offers, and Java, as well. We have a brand new coprocessor design on
each core. It has multiple functions – crypto, compression, hashing, stream conversion. It
is unique to each core and it accelerates certain cryptographic functions depending
on the actual algorithm that’s being used. Like I said, you can see anywhere between two and three and a half times the throughput depending on the encryption standard that you use. It’s very easy to take advantage of this acceleration.
We have hooks all throughout the system that are available kind of for free. The encryption
tools for DB2 and IMS, the z/OS Communications Server – might as well call it VTAM – System SSL, Kerberos authentication, Java; and we’ve accelerated a lot of the crypto services on LINUX, as well. You get all that for free.
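As a rough sketch of what “for free” means in practice – the key and data below are dummies – an application keeps calling a standard crypto library such as OpenSSL, and on Linux on z the distribution’s library build typically drives the on-chip coprocessor underneath these very same calls:

```c
/* Minimal sketch: encrypting a buffer with the standard OpenSSL EVP API.
 * The application code does not change to get hardware acceleration;
 * key and IV here are dummy values for illustration only. */
#include <openssl/evp.h>
#include <stdio.h>
#include <string.h>

int main(void) {
    unsigned char key[32] = {0};              /* dummy 256-bit key */
    unsigned char iv[16]  = {0};              /* dummy IV */
    unsigned char plain[] = "transaction record to protect";
    unsigned char cipher[sizeof(plain) + 16];
    int len = 0, total = 0;

    EVP_CIPHER_CTX *ctx = EVP_CIPHER_CTX_new();
    EVP_EncryptInit_ex(ctx, EVP_aes_256_cbc(), NULL, key, iv);
    EVP_EncryptUpdate(ctx, cipher, &len, plain, (int)strlen((char *)plain));
    total = len;
    EVP_EncryptFinal_ex(ctx, cipher + len, &len);
    total += len;
    EVP_CIPHER_CTX_free(ctx);

    printf("ciphertext length: %d bytes\n", total);
    return 0;
}
```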
I should also point out, though, that I haven’t really touched on it too much, but the compression performance here is also really good. So if
you guys have been using the onboard compression coprocessor in the past to do things like
compress DB2 row data, you’ll see improved performance here, as well. Alright, the system controller. This is one
of the other chips. This is your L4 and each one of these things has 480 megabytes of cache.
That’s a lot of cache. There’s a lot of array cells on here so we have to be very careful
about reliability. There’s tons and tons and tons of redundancy all throughout. We’re constantly
scrubbing and looking for errors so it doesn’t impact you. If a particle from Mars somehow
intersects the chip and flips a bit, you don’t have to worry about it. In addition, we’ve also added this thing called
the NIC. This is kind of a technical detail but let’s say you have a processor on another
drawer, right? And he wants to access some piece of storage that’s on your drawer. Well,
we want to make certain that the L4 contains as much relevant data as possible but we don’t
want it to contain stuff that’s already shared across other L3s. So we’ve got this little
directory to say, “Hey, other drawer? Yeah, we’ve got the storage. No problem, come in
and get it.” But we don’t actually store it in the L4, we have the L3 forward it. So
by the time the remote guy says, “I want the data,” he’s got it served and it’s forwarded
from the L3. It’s a way of maximizing the cache capacity across the L3 and the L4. This
is also where your fabric control is, your drawer-to-drawer communications, as well as
communications between the L4 and the various processor chips. So in aggregate, there’s a lot of cache on
this thing. There’s over a gigabyte of cache per drawer. Like I mentioned earlier, each
core has separate L1 and L2 instruction and data caches, a shared L3 cache, which then talks off chip to your
system controller, which has the L4. And then, on this chart, it shows it talking to memory
but it doesn’t talk to memory through the L4. It actually talks through the processor
chip because the memory controllers are in the processor chip. But this is just a logical
view of how the storage hierarchy looks. And this is what the fabric looks like. We
have X-Bus communications on each logical node between the processor chips and the system
controller. Those run at the exact same speed as the processor nest, right? So you get a
lot of performance here. We have the S-Bus to talk between the two logical nodes and
the A-Bus to talk to remote drawers. It’s not a full interconnect between the nodes.
You don’t have like an X crossbar here. But because of the extraordinarily aggressive
memory affinity that we’ve baked into PR/SM this time around and something called Dynamic
Memory Relocation, it’s an accelerator engine in the L4; we are being extremely aggressive
about making certain that we move the storage that your workload is talking to as close
to the core as possible. So we’re trying to, as best we can, actually eliminate as much
of this cross node and cross drawer traffic as possible. It’s not necessarily that it’s
a huge performance impact but anything we can do to intelligently increase throughput
through memory affinity we’re going to do. The RAIM memory subsystem, so each processor
has a memory controller that supports RAIM upwards of 4 terabytes throughout the system,
two addressable terabytes per drawer. Now keep in mind that about 20% of your total
physical memory is actually redundancy. I’m going to explain why that’s really cool here
in a second. We also have the Centaur controller for vastly improved performance and bandwidth
on each memory card. Now RAIM’s really cool, right? RAIM is – I
mentioned that it was the most reliable memory system in the world and it is. This is a memory
system where if a pin goes down on the board, it can fail over. If multiple pins go bad,
it can fail over to an alternate clock. If chips go bad, no problem, we can mark DRAMs. Too
many chips? No problem, we can mark whole ranks. Too many ranks? No problem, we can
get rid of whole cards. In fact, if it wasn’t, you know, fatal, you could actually go into
the actual machine and start pulling out memory cards and the machine would keep on running. So it is an incredibly robust system and having
that big memory means big performance increases. It means reduced latency, increased throughput,
reduced response time, and it lets you do things that you never thought you could do
before on an entry-level mainframe. You can have in-memory data marts, for example, right? Your guys don’t have to sit around constantly tuning around problems like DB2 buffer pools. There’s space now for that sort of thing. So big memory yields tremendous performance
improvements. I’d also like to start walking through the
I/O subsystem now if you’ll indulge me, and some of our offerings here. I mentioned earlier
a little bit about how we always try to have a balanced design for performance reasons.
One of the great things about IBM System z is the way we do I/O. There’s no machine in
the world that has the I/O connectivity and bandwidth that z does. And some of the reasons
for this is the fact that we throw massive amounts of processing capabilities at the
problem. On z, all your I/O processing is actually offloaded to other processors. So
in a fully configured system where you have all your I/O drawers populated with FICON
cards, for example, you’re going to have over two hundred and fifty POWER CPUs sitting there
chewing on your I/O. And that means that your main engines here can focus on what they should
be focusing on, which is compute problems. And that means that those can be more fully utilized, and it means less cost to the customer. He’s not constantly paying for the cost associated with data movement and I/O processing. This is sort of the logical view. Like I said,
we really transitioned over to PCIe Generation 3. We’ve got a big focus on open standards
here. The bandwidth is much better, the performance is much better, and it’s going to open up
the system later on to broader offerings through third parties. Each processor, you will recall,
has two controllers available and we also still support all of your InfiniBand-based
I/O with a carry-forward drawer so you don’t lose your investment if you’re upgrading. You’ve got fan-outs here for your connections
to your I/O cages in front of each processor drawer; in the center, it’s for your Legacy I/O or for short-range coupling. And these are the PCIe I/O cages, thirty-two slots available.
We’ve got them divvied up among different logical domains to do redundant interconnection
for reliability purposes. So if somebody gets mad or something and goes in there and sort
of pulls something out, no problem, your I/O stays up. And you can see here how we do that
physically. We offer a lot of options for PCIe I/O. Our
FICON Express 16S cards, we’ll talk about that here in a second. I’m not going to go
through OSA-Express for networking because it’s not really new with this generation.
But I do want to talk about our RoCE Express Ethernet for networking. Our compression accelerator,
zEDC Express; our crypto accelerator, Crypto Express5S, which is unbelievably awesome;
and Flash Express, as well, which is a paging device that helps improve performance for
certain scenarios. FICON 16S. So your FICON cards are going to
be the workhorse of your I/O subsystem. We’ve got two flavors; one for long distance upwards
of 10 kilometers, as well as short distance. So whether you’re talking to a remote site
or whether you’re talking to your local storage area network or to other systems in parallel
Sysplex environment, these are the guys that are going to do it for you. They auto negotiate
their speed so they can talk to other versions of FICON Express technology. The speed performance
here is quite a bit better than in the past both for high performance FICON, as well as
Fibre Channel Protocol, FCP. And the controllers here are faster, too, by the way. We’ve got a lot of really cool additions on
the technology side here. First of all, I’d just like to point out in terms of I/O connectivity,
these things can talk to upwards of 32,000 I/O devices apiece. So that means that in
theory, a fully configured z13s can be talking to upwards of two million I/O devices. That is
an extraordinary number. We’ve got some improvements to high performance
FICON extended distance, too, up to 50% I/O service time improvement. We’ve got forward
error correction support, which actually lets us drive up the speed of the links and handle
errors that might be associated with that. So you don’t have to keep dropping down, depending
on problems with optics or distance or whatever else. This is really cool, extended link services.
I don’t know how many of you guys actually ever live in the data center but I can tell
you that diagnosing problems with optical cabling is a real pain. You know, you’re sitting
there with that tool and you’re looking at it and trying to figure out, “Is it the
connector?” We’re actually using ongoing analytics to sit there and figure out if there’s
an error, where is it? Is it in the wire? Is it in the port? Is it in the cable connection
itself? And it points that out to your administrator. That really helps in the event of an error.
And also, we’re extending the ability for workload manager in z/OS to actually use the
SLAs that you’ve defined to prioritize traffic in your storage area network. I mentioned earlier that the performance is
much better. It is much better. You’re going to see improved I/O ops per second both for
high performance FICON, as well as FCP, and a big increase in aggregate throughput here
– 60% increase in FCP, 63% increase in high performance FICON. Okay, RoCE Express. This is a completely native
PCIe solution, by the way. There is no z specific ASIC or anything on here. You’ve got redundant
ports, again. Remember? Reliability? High speed 10 gigabit Ethernet. Now, this is really
for – this is introduced for z13 and z13s. And one of the really cool things about this
particular networking solution is that you can actually share it between multiple z/OS
images. So you can have one physical card out there with a particular link out to your
network and it can be shared across multiple LPARs. Speaking of LPARs, I can’t believe I forgot
this. By the way, with the z13s, we’ve increased the number of LPARs up to forty. I don’t know why I didn’t mention that earlier but there you go. Okay, Crypto Express5S. This is a beast of
a piece of technology. This accelerates pretty much every aspect of cryptography on your
mainframe. It supports the latest and greatest encryption standards including Elliptic Curve
Cryptography, which is now in vogue, thanks to consumer technologies, your iPhones, your
Blackberries. Probably no one here with a Blackberry. It is FIPS 140-2 certified. And
this is also used in conjunction, by the way, with a TKE Workstation that you can hook up
on your system to allow complete end-to-end security if you’re going to manipulate the
master keys on this card. It’s twice the performance compared with our Crypto Express4S solution
– new cache, new ASIC – and it is PCIe based. You can run it in a couple different
configuration options. You can run it for clear key operations or SSL acceleration and
it supports pretty much every encryption standard known to man including – and it’s not on
this chart – the point-to-point encryption standard from Visa. You can accelerate this. The beauty of this is that it takes all of
the complication associated with keeping your data encrypted and secure, and it offloads
it off of your processors, again, right? We’re constantly trying to find new ways of offloading
things away from the processors where we can to increase throughput. We want to make certain
that those cores are working on compute problems only. We want to accelerate everything else. Flash Express. We’ve had Flash solutions before.
You can think of this as sitting in your memory stack a little bit beyond your main memory.
It’s used through EADM, which is a facility for z/OS. It’s also supported for LINUX. And
it basically absorbs those moments where your workloads spike and you run out of memory
and you’ve got to page out. Paging out to DASD is not something you really want to do
on a regular basis, right? It hurts your performance and it costs you. Because any time you have
that data movement, it costs you. With the Flash solution, you get much better throughput
for those scenarios and it’s supported by a tremendous amount of our enterprise software
stack – Java, CICS, WebSphere, DB2, IMS. You can also use this, by the way, to absorb
some of the structures that the coupling facility uses. You can use it in queue overflow issues.
You can use it when you’re doing diagnostics, right? When you do diagnostics, your big auditing,
you get these massive data sets and sometimes they grow outside of memory. You can page
into this, you get much better performance. I actually wanted to show you a quick graph.
I don’t know how many of you all have seen this. But this is a transition test, the type
of thing you might have on your mainframe in the morning when certain subsystems are
firing up. And we ran out of memory and we started to page out. Well, when you’re paging
out to DASD, it took about forty-four seconds before we reached steady state performance.
With the addition of Flash, it only took ten seconds. And for security reasons, by the
way, because this is talking directly to memory, you’re looking at 1.4 terabytes of AES encrypted
enterprise grade FLASH. It also has a 512 megabyte cache. You can configure your system
with multiple Flash Express cards and we’ve got redundant connections, as well, to set
up a RAID array for reliability purposes. Compression, okay. So we do have – just
like the Crypto solution where we do have a coprocessor on the processor but we’ve also
got a big accelerator card. Same thing with compression. zEnterprise Data Compression
is really designed for massive data sets, right? It’s one thing when you’re doing compression
on a transactional basis. You know, this type of thing, it could be used but it’s better
to use the onboard coprocessor. But when you’re talking about big data sets, you want to offload
all of that compression. The compression that this offers, much better performance and much
better compression ratios, as well, actually. We’ve got support for it throughout a lot
of the file systems, through z/OS, our sequential data sets, and again, anything that offloads
these types of functions off the processor means you’re not paying for all that data compression in your software charges with your third-party ISV software. Speaking of third party ISVs, this has been
opened up to the ISV world and there’s lots and lots of third party software and technologies
that take advantage of the accelerated compression that this offers. It’s supported by all of
our operating – it’s supported by z/OS, it’s supported by z/VM. You can have, I think,
upwards of eight of these in a given frame. We actually use it internally for our employee directory and some of the compression ratios that this produces are really, really incredible.
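As a rough sketch – the record data is invented and this plain build just compresses in software – code that already goes through zlib-style calls is the kind of code positioned to pick up accelerated compression when a hardware-enabled zlib is supplied underneath:

```c
/* Minimal sketch using the standard zlib API. The application view stays the
 * same whether the work is done in software or offloaded to an accelerator.
 * Buffer contents are made up for illustration. */
#include <zlib.h>
#include <stdio.h>
#include <string.h>

int main(void) {
    const char *text =
        "2016-02-16,ACCT0001,DEPOSIT,100.00;2016-02-16,ACCT0002,DEPOSIT,100.00;";
    uLong srcLen = (uLong)strlen(text);
    uLongf dstLen = compressBound(srcLen);
    unsigned char out[256];

    if (compress2(out, &dstLen, (const unsigned char *)text, srcLen,
                  Z_BEST_SPEED) != Z_OK) {
        fprintf(stderr, "compress2 failed\n");
        return 1;
    }
    printf("compressed %lu bytes down to %lu bytes\n",
           (unsigned long)srcLen, (unsigned long)dstLen);
    return 0;
}
```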
I mentioned earlier that we have a new short-range coupling card. This goes in the fan-out space where you would normally have your InfiniBand fan-out cards. Better performance than before. This is really only for in-house parallel
Sysplex coupling and it is only supported currently for the z13 and z13s but the performance
benefits here are really substantial and we highly recommend folks use it. Now, all of these hardware goodies mean absolutely
nothing if there’s no software availability to take advantage of it. So the various software
boffins at IBM and across our operating systems group, middleware group, and applications
development group have really been working to harness the large memory, SIMD capabilities, and SMT across every layer, every level of our software stack. z/OS currently scales
up to the maximum number of available LPARs; forty, as I mentioned. It takes advantage
of all the crypto and compression acceleration that we talked about whether it’s the on-processor,
coprocessor, or whether it’s the Crypto Express or zEDC cards. We’ve got improvements to z/OSMF
for those of you who don’t like doing TSO logons. I kind of do. VSE is actually going to be delivered in a new LPAR type called zACI. We’re going to talk about that in a second. It’s encapsulated in this. It’s got better support for cryptography acceleration and we’re replacing the networking stack underneath, using LINUX, to provide substantially better TCP/IP performance. There has been
a huge amount of work on the LINUX stack within IBM to take advantage of all of the new
technologies within the processor complex. This is fantastic – GDPS. We’re going to
deliver this in a virtual appliance. And it’s actually got encapsulated z/OS in it, right?
We wanted to deliver it in such a way that you interact with it only through a web GUI.
Because not everybody out there is familiar with z/OS. A lot of the newer younger people
might not be familiar with this sort of thing. I saw a demo of this. It’s just the most amazing
thing. You know, you’ve got Machine A that’s running a LINUX workload and you’re talking
to it through a web server. And you’ve got Machine B offsite, right? He’s running. Take
the machine offline, the workload goes, “Boop,” and all of a sudden, now you’re talking over
here almost instantaneously. This is some of the best disaster recovery technology in
the world and bringing it to LINUX, I think, is a really big benefit for open source on
z. We’re also supporting Ubuntu, as well as SUSE and Red Hat. For TPF, the tool chain will be updated to take advantage, for example, of SIMD in the future. There’s a big new release of VM coming up. A preview here shortly at the
end of the year. Again, this is to take advantage of some of the new processor technology we’re
introducing. It also has a lot of performance improvements for the SCSI subsystems, more
large pages. Pipeline’s been cleaned up quite a bit. It now supports Ubuntu, as well, and
there’s a whole host of improvements. I couldn’t fit it all in one chart. So for any of you
all who are interested in VM, I can point you to a letter that we’ve published that
had a lot of the features that are coming. Okay, you all remember how I was talking a
little bit about HiperSockets earlier? Well, we’re going to talk about that in a second.
But I want to talk about a new software technology called Shared Memory Communications – RDMA, or SMC-R.
And what this enables you to do is you’ll have two physical systems. The OS image is
on each system. And they’ve got a dedicated link through a RoCE card – that’s our 10-gigabit
Ethernet card you remember we talked about earlier – and they have these shared memory
structures. And so what’s happening is we’re kind of completely bypassing the TCP/IP stack
to talk to each other. It’s kind of like HiperSockets but across multiple physical machines. The
performance improvement here is really, really substantial. For bursty type traffic, you
can see latency go down really substantially, upwards of 80%. For streaming type workloads,
you’re going to see a substantial improvement in the amount of CP utilization that is associated
with that. It goes down by quite a bit because you don’t have to do all this networking work. Now, VTAM, the communication server, will
automatically switch between TCP/IP and this new technology, SMC-R, where it’s available.
You don’t have to worry about it, completely transparent to your applications. You don’t
have to do anything, right? New technology. If you’re talking between z/OS images across
TCP/IP, if this is available, it’ll automatically use it and your performance is going to improve.
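As a rough sketch of why it’s transparent – the address and port below are made-up values – the application keeps writing to an ordinary TCP socket, and the communications stack decides underneath whether to use SMC:

```c
/* Minimal sketch of an ordinary TCP client. Nothing SMC-specific appears here:
 * when both endpoints support SMC-R (or the in-frame variant), the stack
 * negotiates it underneath these same socket calls. */
#include <arpa/inet.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) { perror("socket"); return 1; }

    struct sockaddr_in peer;
    memset(&peer, 0, sizeof(peer));
    peer.sin_family = AF_INET;
    peer.sin_port = htons(8000);                    /* illustrative port */
    inet_pton(AF_INET, "10.0.0.2", &peer.sin_addr); /* illustrative address */

    if (connect(fd, (struct sockaddr *)&peer, sizeof(peer)) < 0) {
        perror("connect");
        close(fd);
        return 1;
    }
    const char *msg = "hello from one image to another\n";
    write(fd, msg, strlen(msg));
    close(fd);
    return 0;
}
```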
We also have a variant of this that works within a given physical frame – SMC-D. This is unbelievable. I mean, I thought HiperSockets, right, were
fast. This thing, I have no idea how they’ve done this but the performance improvement
is very substantial. If you have two z/OS images and you want to communicate between
them; you have this shared memory structure where they’re writing into and reading out
of. And for streaming type workloads like Communications Server or Apache, those types
of things, you’re looking at a nine fold improvement in throughput, 99% decrease in CPU consumption,
and a 90% decrease in response times for these types of workloads. Again, I saw a demo of
this, you know. We got a, I think if I recall correctly, it was Apache talking to some application
server in a different image. And we have some analytics sitting there, you know, taking
a look at throughput numbers so you know, it’s doing one of these things. They turned
this thing on, it went, “Shoop,” right through the roof. So this is an amazing technology.
It is available completely transparent to your applications. We’re baking support into
this, into our operating systems, and you’re going to see a tremendous performance, positive
performance delta associated with this in a lot of things. I mentioned earlier that we’re going to deliver
zACI. This is a new z-specific container technology. It’s a way of taking an operating system,
middleware, applications and so forth, putting it into a completely secure container so you
can include proprietary applications and technology and not worry about that. And the user interacts
with it via RESTful APIs, web services and so forth. We’re going to deliver VSE this
way going forward. It’s a brand new LPAR type. It’s supported by both the z13 and the z13s. And one of the other things that we’re going
to deliver with this container technology is zAware. I don’t know if you guys are familiar
with zAware but zAware uses advanced analytics to monitor your systems, looking at the performance
of workloads, looking at things like console logs, and trying to find functional and performance
issues as they occur or in many cases, before they occur. And in Version 2, we’re actually
extending that type of analytical capability for finding problems to LINUX. So no more sitting
around searching through /var/log/messages or anything like that, or the logs for all
of your middleware solutions. This is going to do it for you and it’s going to help you
find problems before they occur. Brand new web-based GUI, very easy to use. Even I can
use it. You can find all sorts of bizarre performance or functional issues. It finds them for you in as little as two clicks. So I mentioned earlier that we’ve got a really
big push into LINUX, right? Well, we do. And it’s not just associated with the LinuxONE
effort. On z now, in addition to the three big distributions – and I’m going to talk about another hypervisor technology here in a second – we support all of the popular open source runtimes, the popular open source container solutions, language tool chains, databases, and analytics capabilities.
And we enable you to take these technologies that you may be familiar with from the distributed
world and deploy them on a system with much, much, much greater scale; much, much, much
greater I/O capability at much reduced cost. And one of the new management tools that we
are shipping here associated with this announcement is the Dynamic Partition Manager. And this
is a management layer for LINUX and it can be talking to LINUX partitions whether they’re
running on top of LPAR or whether they’re running on top of KVM. It’s got a very easy-to-use
front end. You can set up all sorts of alarms and rules for events, conditions, state changes.
And I think the really cool thing about it is it brings some of the capabilities that
you might be familiar with from z/VM to the LINUX world, including the ability, while your LINUX system images are running, to manipulate, for example, I/O configs underneath them and have the workloads continue to run, which I think is a really cool piece of technology. The only downside to this is that LINUX images
are running on top of this, zAware can’t see into them yet. You never know what’s going
to happen. We also have big improvements for KVM, which is the very popular open source
hypervisor for LINUX. This has been updated to take advantage of SIMD and SMT. We’ve got
new problem determination technologies we’ve put in here for better availability. It takes
advantage of crypto acceleration. And of course, it supports Ubuntu, as well, so we’ve put
a lot of work into KVM. I should also point out that in addition to
all these open source tools, we’re also extending our enterprise level software solutions like
z/OS – we have z/OS Connect that lets you actually use RESTful APIs to kick off
things like batch jobs. I can’t tell you how cool it is to be able to sit here on your phone, right? And actually have JES kick stuff off for you by pressing a button.
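As a rough sketch – the URL and payload are placeholders, not a documented endpoint – kicking off work over REST is just an ordinary HTTP call from any client, which is what makes the phone scenario possible:

```c
/* Minimal sketch of driving back-end work through a REST call with libcurl.
 * The URL and JSON body are hypothetical placeholders for illustration. */
#include <curl/curl.h>
#include <stdio.h>

int main(void) {
    curl_global_init(CURL_GLOBAL_DEFAULT);
    CURL *curl = curl_easy_init();
    if (!curl) return 1;

    struct curl_slist *headers = NULL;
    headers = curl_slist_append(headers, "Content-Type: application/json");

    curl_easy_setopt(curl, CURLOPT_URL,
                     "https://host.example.com/batch/jobs");  /* placeholder */
    curl_easy_setopt(curl, CURLOPT_HTTPHEADER, headers);
    curl_easy_setopt(curl, CURLOPT_POSTFIELDS,
                     "{\"job\": \"nightly-report\"}");         /* placeholder */

    CURLcode rc = curl_easy_perform(curl);
    if (rc != CURLE_OK)
        fprintf(stderr, "request failed: %s\n", curl_easy_strerror(rc));

    curl_slist_free_all(headers);
    curl_easy_cleanup(curl);
    curl_global_cleanup();
    return 0;
}
```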
We’ve extended z/VM to be able to interact with OpenStack management tools. So the idea here is whether you’re a mainframer like me
or whether you’re brand new to the mainframe and more familiar with the cloud solutions
and management solutions that you’re familiar with from the distributed world, they both exist on z. Okay. Before I close up, I just want to say
I am one of more than a thousand engineers at IBM who work on IBM z – the hardware,
the software. And we’re very passionate about this platform. We worked really hard on it.
And the thing that I’ve really enjoyed about this conference is getting to talk to some
of you all and some of the customers and see what you’re doing with the babies that we’ve
built. Because for us to see some of the innovations that you all are bringing to your businesses
that we never even imagined is tremendously gratifying. So I want to thank you all for your support
of the platform and I want to thank you all for listening to my spiel.