What is MIPS, are we being misled By the Term "MIPS"?

Post by **Anuj Dhawan** » Thu Jul 18, 2013 11:18 am

I have wanted to start a discussion on the term MIPS for a long time, because I think it is one of the most misused terms in IT. It's supposed to stand for "millions of instructions per second," but many alternate meanings have been substituted:

Misleading indicator of processor speed
Management's impression of processor speed
Meaningless indicator of processor speed

and probably many more.

Jokes apart, it is none of them. In a crude sense, MIPS is the "throughput" of a system. Anyone in the business who works with IBM products should understand that IBM stopped using this term long ago and uses "MSU" instead.

I'd request the members to chip in and help conclude how logical it is to say "MIPS reduction in 2013". Someone in management suddenly starts looking for a single figure to represent a processor's capacity, and in turn the "software engineers" start spending large amounts of money, on both software and hardware, based on a poorly understood indicator.

There is a lot more to talk about and I'll get back to this half-completed topic soon - right now I'm getting a call for a production ticket...

Be well,

Later

Post by **Robert Sample** » Thu Jul 18, 2013 4:48 pm

People want to misuse terminology. Even when IBM was using the term, MIPS never (I repeat, NEVER) referred to anything less than a processor (an LPAR in current terminology). If someone wants to cut the total CPU seconds used by an application, then that is the phrase to use -- suggesting they want to "reduce MIPS" indicates only a complete lack of understanding of capacity planning and resource consumption. The ONLY way to "reduce MIPS" is to take out the current box (processor) and put a smaller box in its place. Yes, the entire machine must be replaced to "reduce MIPS". Something similar can be done by LPAR capping but that is not reducing the MSU (MIPS), that is reducing the amount that the site can use. The processor still runs at X MSU, but the site doesn't get to use the full speed.

I've seen people refer to an application using so many MIPS. No, it does not -- unless an entire machine (LPAR) is dedicated to that application. It may use so many CPU seconds, or so many I/O, or so much memory -- but it does not use MIPS (or MSU).

Post by **Sachin Kumar** » Fri Jul 19, 2013 12:25 pm

This is an amazing discussion. I've been doing this all the time and have been using the term so frequently. Looking forward to hearing more on this.

Post by **Anuj Dhawan** » Tue Jul 23, 2013 11:49 am

You've rightly said this, Robert. Millions of instructions per second is the execution speed of a computer. For example, 0.5 MIPS is 500,000 instructions per second; 100 MIPS is a hundred million instructions per second. Unless I'm mistaken, MIPS was a popular rating before computers reached gigahertz speeds, but MIPS ratings were never uniform. In addition, one machine may need more instructions than another to do the same thing (RISC vs. CISC, as Andy Grove discusses in "... Paranoid Survive"; mainframe vs. PC). As a result, MIPS has picked up the many alternative names mentioned earlier.

Again, MIPS measures roughly the number of machine instructions that a computer can execute in one second. However, different instructions require more or less time than others, and there is no standard method for measuring MIPS. In addition, MIPS refers only to CPU speed, whereas real applications are generally limited by other factors, such as I/O speed. A machine with a high MIPS rating, therefore, might not run a particular application any faster than a machine with a low MIPS rating. All these reasons are enough not to use MIPS ratings for this purpose anymore. Perhaps now it's just a 'Meaningless Indicator of Performance'.
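To make the point about instruction mix concrete, here is a minimal Python sketch. The clock speed, the instruction classes, and the cycle counts are all made-up assumptions for illustration, not figures for any real processor. It only shows that the same hypothetical machine produces very different "MIPS" numbers depending on which instructions the workload executes.

```python
# Hypothetical cycle cost per instruction class (illustrative numbers only).
CYCLES_PER_INSTRUCTION = {
    "register_add": 1,
    "memory_load": 4,
    "decimal_divide": 40,
}

CLOCK_HZ = 1_000_000_000  # assumed 1 GHz clock


def effective_mips(clock_hz, instruction_mix):
    """Return millions of instructions per second for a given mix.

    instruction_mix maps an instruction class to the fraction of executed
    instructions in that class; the fractions are expected to sum to 1.
    """
    avg_cycles = sum(
        CYCLES_PER_INSTRUCTION[name] * share
        for name, share in instruction_mix.items()
    )
    return clock_hz / avg_cycles / 1_000_000


# Two hypothetical workloads running on the same machine.
simple_mix = {"register_add": 0.9, "memory_load": 0.1, "decimal_divide": 0.0}
heavy_mix = {"register_add": 0.5, "memory_load": 0.3, "decimal_divide": 0.2}

simple_mips = effective_mips(CLOCK_HZ, simple_mix)
heavy_mips = effective_mips(CLOCK_HZ, heavy_mix)

print(f"simple mix: {simple_mips:.0f} MIPS")
print(f"heavy mix:  {heavy_mips:.0f} MIPS")
```

The hardware is identical in both cases; only the workload changed, which is why a single MIPS figure says little without knowing what was measured.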

Post by **Anuj Dhawan** » Tue Jul 23, 2013 11:50 am

Thanks for stopping by Sachin - hope you enjoy reading all this.

Post by **Anuj Dhawan** » Fri Dec 20, 2013 5:22 pm

I was thinking that I'm done with this BUT - to use the phrase "we are doing MIPS optimization" or not to use it, is the question! Any (more) thoughts?

Post by **cambridge-apt** » Fri Dec 27, 2013 3:24 pm

I have found a good piece of work devoted to MIPS, MSUs and CPU usage optimization in large mainframe environments. Have a look at this:

https://www.academia.edu/2078094/Reduci ... management

More in:
http://dare.ubvu.vu.nl/bitstream/handle ... sequence=1

Post by **Gerhard_Adam** » Sun Nov 29, 2015 1:37 am

MIPS and MSUs are abused terms. The processor measure is never evaluated in MIPS. It is assessed using Service Units and then related by ITR [Internal Throughput Rate]. MIPS is absolutely meaningless without instruction measurements and understanding instruction pipeline flow.

MSUs do NOT represent capacity, except as a software licensing charge measure for the processor. MSUs should NEVER be used for performance or capacity planning.

BTW, if anyone ever suggests you can't use Service Units and should use MIPS, you should understand that every MIPS chart produced uses service units to get their numbers.

To assess processors, use IBM's LSPR which explains both the measures and the workload benchmarks used to derive them.

https://www-304.ibm.com/servers/resourc ... nt&pathID=

Post by **Sachin Kumar** » Mon Nov 30, 2015 12:26 pm

I am badly confused now. If we should not be looking at MIPS and should not be talking about MSUs either, then why are terms like "MIPS optimization" so popular? And how exactly should we do "MIPS optimization"?

Post by **nicc** » Mon Nov 30, 2015 3:43 pm

Because people do not know what they are talking about, or have not kept up with the terminology. As there is no such thing as "MIPS Optimisation", you cannot do it. This does not mean that you cannot tune things to make more efficient use of the CPU power.

Post by **Sachin Kumar** » Fri Apr 01, 2016 9:27 am

Hi,

There is one more question I would like to ask here. Suppose we have an LPAR running, say, 1000 jobs, and we say:
1. We need to do MIPS optimization for this LPAR.
2. We know there are some high-CPU jobs, and we need to reduce the CPU use of these jobs.
In these two cases, what is the difference between MIPS optimization and CPU reduction activities?

Post by **Nick Jones** » Fri Apr 01, 2016 9:31 am

To muddy it further, what is performance tuning then?

Post by **Robert Sample** » Fri Apr 01, 2016 6:08 pm

I want to start off apologizing for the length of this post, but there's a lot to be said here. As usual, there is some simplification here so doing in-depth research will improve your understanding of what is discussed here, and possibly change it.

First, I did a Google search for the term "mips optimization" and it appears to be a term used exclusively by one company - an India-based consulting company. If you want to know more about it, I recommend you contact the company and ask them. It is certainly NOT a standard industry term.

Second, computer measurement, performance tuning, and capacity planning are extremely complicated topics and you're not going to get more than a smidgen of information on them from this or any forum. Visit the Computer Measurement Group website (there are others around) and start reading if you want to know more.

Third, MSU and MIPS are THEORETICAL terms that have picked up practical usage. I'll use as an example the z800-001 a previous employer had. If you check the charts, that machine was rated at 32 MSU or 192 MIPS (I believe the MIPS is down to 188 on the latest charts but haven't confirmed that lately). That means the machine has the capability to process 32 million service units per hour. A service unit is a measure of work (CPU, I/O, storage and SRB are the 4 IBM uses) and many if not most systems will show the service unit usage on the job output. Every 24 hours that machine can process 32 times 24 or 768 million service units of work. As long as that machine is there, that is the MSU for the machine. The only way to change the MSU is to upgrade or downgrade the physical box -- which sometimes can be done in the field by IBM and sometimes requires replacing the box. The -001 means that machine had one engine (one processor) to perform the 32 million service units. The z800 also had -002, -003, -004 models with 2, 3, 4 engines. So if I think I have 125 million service units of work per hour to perform, I could get a z800-004 and be happy, right? No - the incremental service units go down as the number of engines increases, because the engines have to spend time communicating with each other to make sure nothing goes awry. So a z800-004, despite having 4 times the engines of a z800-001, was only rated at 108 MSU (636 MIPS), not the 128 that 4 times 32 indicates.

Side note: do not assume that 6 MIPS per MSU is a constant -- it varies by the machine. The zBC12, for example, appears to be running about 8 MIPS per MSU.
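The arithmetic in the z800 example can be checked with a short Python sketch. The ratings are the ones quoted in the post; the function names and the dictionary layout are assumptions made purely for illustration.

```python
# Ratings as quoted in the post for the z800 family.
RATINGS = {
    "z800-001": {"engines": 1, "msu": 32, "mips": 192},
    "z800-004": {"engines": 4, "msu": 108, "mips": 636},
}


def daily_service_units(msu):
    """Service units a machine rated at `msu` can process in 24 hours."""
    return msu * 1_000_000 * 24


def scaling_efficiency(model, base="z800-001"):
    """Rated MSU divided by what naive linear scaling from the base predicts."""
    m, b = RATINGS[model], RATINGS[base]
    linear_msu = b["msu"] * m["engines"] / b["engines"]
    return m["msu"] / linear_msu


def mips_per_msu(model):
    """Ratio of the quoted MIPS rating to the quoted MSU rating."""
    r = RATINGS[model]
    return r["mips"] / r["msu"]


print(daily_service_units(RATINGS["z800-001"]["msu"]))
print(scaling_efficiency("z800-004"))
for model in RATINGS:
    print(model, round(mips_per_msu(model), 2))
```

On these figures the four-engine model delivers about 84% of what straight multiplication would suggest, and the MIPS-per-MSU ratio is not quite the same even between two models of the same family.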

Fourth, when someone says a job uses 3 MSU (for example), I assume that they have looked up the actual service units used by the job and divided the total service units by the number of seconds the job executed and converted to MSU. If all that has been done, you can -- sort of -- discuss reducing the service unit usage of the job. Any other use of the term MSU for a job is incorrect at best and wildly wrong at worst.
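As a rough sketch of the conversion described in the paragraph above, assuming the total service units and the elapsed seconds have been read from the job output. The job figures here are hypothetical, and `job_msu_rate` is an illustrative name, not an IBM-defined calculation.

```python
def job_msu_rate(total_service_units, elapsed_seconds):
    """Average consumption rate in millions of service units per hour."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    su_per_hour = total_service_units * 3600 / elapsed_seconds
    return su_per_hour / 1_000_000


# Hypothetical job: 1,500,000 service units consumed over 30 minutes.
rate = job_msu_rate(1_500_000, 1800)

# Share of a machine rated at 32 MSU (the z800-001 rating quoted earlier).
share_of_machine = rate / 32

print(f"{rate:.2f} MSU while running, {share_of_machine:.1%} of a 32 MSU box")
```

Note that the result is a rate that only holds while the job is running; it says nothing about the rating of the machine itself, which stays at 32 MSU.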

Fifth, the LPAR is running 1000 jobs -- but you didn't quantify that. Is that 1000 jobs a day? week? month? year? forever? And the number of jobs doesn't really matter -- what matters is the CPU usage, channel usage, storage usage, SRB usage and the average / peak usage of each of these values. You could, if the system is set up for it, run 1000 jobs every second and as long as they are short, low-impact jobs you may never notice them in the system.

Sixth, jobs are not usually where the system is most active, even though that's what the applications programmers see. My current employer's system has 4 LPARs defined and allows up to 600 address spaces (batch jobs, TSO users, started tasks, and OMVS processes) to be executing at one time. At any given time, just under 300 of those address spaces are running system tasks (databases, CICS, RACF, WLM, JES, and so forth), so if some of the system tasks can be tuned more efficiently, that will usually have a larger impact than tuning batch jobs.

Seventh, a computer program is bottlenecked. Always. If it didn't have bottlenecks it could get done in zero seconds and make everyone very happy. Bottlenecks will either be CPU (CPU-bound) or I/O (I/O-bound), where I/O can represent delays due to the channel, or the disk drive, or the control unit, or the tape drive, or the buffering of data, or .... If a program is I/O-bound, you could give it the fastest CPU in the world and the job would take exactly the same amount of time. Similarly, if a program is CPU-bound, changing from tapes to disks to solid-state devices will make no difference to how long the job takes. If you don't know whether a given job is CPU-bound or I/O-bound, there is not much you can do to improve its performance since you don't know where to start. If your high-CPU-using jobs are I/O-bound, even though they are using a lot of CPU time, then you won't be able to impact their CPU usage much. If they are CPU-bound, you look at the standard culprits first -- buffers, access patterns for the data, code that is doing too much, inefficient code (which should be near the bottom of anyone's tuning list).
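The bottleneck argument above can be sketched with a deliberately idealized model in Python. The assumption, which real systems only approximate, is that CPU work and I/O waits overlap perfectly, so the slower of the two sets the elapsed time. All job figures are hypothetical.

```python
def elapsed_time(cpu_seconds, io_seconds):
    """Idealized model: CPU and I/O overlap fully, the slower one sets the pace."""
    return max(cpu_seconds, io_seconds)


def classify(cpu_seconds, io_seconds):
    """Label the job by whichever resource dominates in this model."""
    return "CPU-bound" if cpu_seconds >= io_seconds else "I/O-bound"


def with_faster_cpu(cpu_seconds, io_seconds, speedup):
    """Elapsed time after dividing the CPU time by `speedup`."""
    return elapsed_time(cpu_seconds / speedup, io_seconds)


def with_faster_io(cpu_seconds, io_seconds, speedup):
    """Elapsed time after dividing the I/O time by `speedup`."""
    return elapsed_time(cpu_seconds, io_seconds / speedup)


# Hypothetical I/O-bound job: 600 s of CPU, 3000 s of I/O.
io_job = (600, 3000)
# Hypothetical CPU-bound job: 3000 s of CPU, 600 s of I/O.
cpu_job = (3000, 600)

for name, job in (("io_job", io_job), ("cpu_job", cpu_job)):
    print(
        name,
        classify(*job),
        "baseline", elapsed_time(*job),
        "10x CPU", with_faster_cpu(*job, 10),
        "10x I/O", with_faster_io(*job, 10),
    )
```

In this model a ten times faster CPU leaves the I/O-bound job's elapsed time unchanged, and ten times faster devices do the same for the CPU-bound job, which is why knowing the bottleneck comes before any tuning.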

Eighth, performance tuning should (but doesn't always) start from a global approach. Frequently, changing the start times of certain jobs to reduce conflicts with other applications can raise the overall throughput of the system. Performance tuning should be looking at the batch window to see what can be done, as well as looking at the online processes to see what can be done. Sometimes adding a database on a different DASD channel may well speed up online processing, even though overall CPU time goes up because of the additional database. Workload management policies can have a big impact -- I have a system where I changed the WLM policy for batch processing, and the elapsed time for one job dropped from 14 hours to 45 minutes (the second was a rerun so the record counts were exactly the same).

Ninth, the advent of specialty engines muddies the MSU / MIPS picture. A processor (engine) on a System z machine can be a CP (general purpose processor), SAP (System Assist Processor -- dedicated I/O processor), IFL (Linux-only), ICF (Coupling facility-only), zAAP, or zIIP -- and each LPAR can be on a dedicated engine or share the use of an engine. If a machine has 20 engines with 40 LPARs defined on it, with some of these LPARs being CP and some IFL and some ICF and some zAAP and zIIP, the very concept of MIPS starts getting pretty fuzzy -- especially since some of these engines may be running at different speeds.

Post by **Anuj Dhawan** » Mon Apr 04, 2016 12:09 pm

Robert Sample wrote:I want to start off apologizing for the length of this post, but there's a lot to be said here.

That's an excellent post there! I feel the lack of a 'like' button on the website now.

MIPS is a term I have seen used so badly that it seems to have lost its context. For that matter, MIPS has not been a meaningful measurement for decades, and IBM itself has stopped using it. If you are in the business of mainframe performance, this is perhaps the hardest concept to put across to bean counters. But as the word "optimization" sounds very impressive, I think almost everyone buys into it and relates it to dollar savings, and so MIPS continues to be misused.

OTOH, the other term I had been reading about some time back is ITR, Internal Throughput Rate. I don't have enough statistics to say it for sure but, I believe, IBM's preferred method to size a mainframe is ITR. It is published in IBM's LSPR (Large Systems Performance Reference). This is a number that compares a processor to IBM's base 2094-701 processor, which has an ITR of 1.0. ITR is not perfect either, but it is the closest. As Robert has said, with the large differences and mix of workloads on the latest processors, one might observe a visible difference between a given shop's workload and the ITR workload. The good news is that IBM has created a PC-based tool to calculate a system's ITR value from RMF and other data. The tool is known as zPCR (System z Processor Capacity Reference); it can be downloaded free from the IBM website and used for your shop: http://www-03.ibm.com/support/techdocs/ ... ex/PRS1381

Post by **Sachin Kumar** » Mon Apr 04, 2016 2:02 pm

Thank you for the long explanation.

Robert Sample wrote: Fifth, the LPAR is running 1000 jobs -- but you didn't quantify that. Is that 1000 jobs a day? week? month? year? forever? And the number of jobs doesn't really matter -- what matters is the CPU usage, channel usage, storage usage, SRB usage and the average / peak usage of each of these values. You could, if the system is set up for it, run 1000 jobs every second and as long as they are short, low-impact jobs you may never notice them in the system.

Here I was saying 1000 jobs a day.

Post by **Robert Sample** » Mon Apr 04, 2016 5:39 pm

Reducing CPU usage of a job depends largely upon the program(s) being executed. If the high CPU is due to a DFSORT that is sorting a billion records, then there isn't much of anything that can be done to reduce CPU time -- DFSORT, like many utilities, is tuned for system performance already. If the high CPU is due to an application program, then you need to use a source code analysis tool such as STROBE to look at where the use of CPU time is. If the STROBE report shows no source statement using more than 4 or 5% of the total time, you're not going to be able to do very much to reduce CPU time without completely rewriting the program. I did have one case at a previous employer where STROBE showed 98% of the CPU time for a program was in a single COBOL statement -- an INITIALIZE that was running on a table of 99 elements with 9999 bytes per element (basically 1 million bytes). Since only some of the elements were needed (the rest were for future use), I recommended changing the code to initialize only those elements being used. This change reduced the elapsed time for the job from 2 1/2 hours to 15 minutes, and the CPU time used dropped proportionally. This was, however, an unusual case.

Without a source analysis tool like STROBE, you're just guessing at what's causing the high CPU usage and you are more likely to be guessing wrong than right. For the INITIALIZE program problem, the development manager kept insisting we needed to install faster disk drives because the program had to be I/O-bound to be taking that long. After the changed code ran the first time, he decided we didn't need faster disk drives after all.
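The INITIALIZE fix described above was made in COBOL; the following is only a Python analogue of the idea, with the table dimensions taken from the post and the count of used elements invented for illustration. It compares how many bytes are reset when the whole table is cleared against clearing only the elements actually in use.

```python
ELEMENTS = 99          # table size from the post
ELEMENT_BYTES = 9999   # bytes per element from the post


def clear_all(table):
    """Reset every element, analogous to initializing the whole table."""
    for i in range(len(table)):
        table[i] = bytearray(ELEMENT_BYTES)
    return len(table) * ELEMENT_BYTES  # bytes written


def clear_used(table, used_count):
    """Reset only the first `used_count` elements that the program needs."""
    for i in range(used_count):
        table[i] = bytearray(ELEMENT_BYTES)
    return used_count * ELEMENT_BYTES  # bytes written


table = [bytearray(ELEMENT_BYTES) for _ in range(ELEMENTS)]

full_cost = clear_all(table)
partial_cost = clear_used(table, used_count=5)  # 5 is a hypothetical count

print(f"whole table: {full_cost} bytes per call")
print(f"used part:   {partial_cost} bytes per call")
```

If such a reset runs once per input record, the difference in bytes touched per call is multiplied by the record count, which is the kind of saving a profiler report makes visible and a guess about disk speed does not.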

Post by **Gerhard_Adam** » Sat Jun 25, 2016 8:32 am

As I mentioned previously, MSUs [Millions of service units per hour] should NEVER be used as a measure of performance or capacity. They are numbers that were made up to account for a basic size of a system and are used to establish software licensing costs. They do NOT reflect actual resources consumed.

The service units [CPU and SRB] reflect the service consumed and are based on the SRM Constant, which is used to normalize service usage across various processor models. In addition, the numbers published by IBM are adjusted to account for the MultiProcessing effect [MPE] when multiple engines are involved in an LPAR.

However, the primary reason why MIPS are meaningless is that they don't reflect the actual instruction processing on the machine. CPUs are pipelined and super-scalar, which means that instruction streams can be executed in parallel and even out of sequence, with the instruction element performing some operations that don't require the execution element. Coupled with micro-code assists, the notion of "millions of instructions per second [MIPS]" is a complete fantasy. Even worse, if you compare the supposed MIPS charts generally available, you'll see that they are little more than an adaptation of the IBM LSPR document and therefore a waste of time.

As has been mentioned, you cannot reduce MIPS or service units or anything for a job in a general sense. Any such reduction REQUIRES that the job is inefficient and consuming resources that could be made available for other work. Without that qualification there is no way to reduce CPU consumption. In other words, something is being wasted.

Bear in mind that EVERYTHING that occurs consumes CPU at some level. I/O activity will consume CPU for the I/O operation, interrupt handling, the dispatcher, etc. As a result, reducing I/O activity will also reduce CPU activity. Similarly if the system is paging, then CPU resources will be consumed to handle those memory demands.

Anything that is done that doesn't absolutely need to be done is potentially a waste and potentially capable of being tuned.
