Musings of an IT Implementor

Analysing High SAP Roll Wait Time - NW702

One of my clients identified an increase in the Response Time measurements they monitor.  They use the EarlyWatch Report to monitor the SAP system SLAs against response time performance.
I initially thought this was going to be a nice easy assignment: identify the database performance issues and propose some reorganisation work.  After all, that's the pattern in roughly 75% of the cases I've seen.

Basic info: SAP NW 702 on Oracle 11gR2 on Unix.
SLAs are monitored against DIALOG response time from 9am to 6pm weekdays only.

I brought up the ST03 screen and checked the "Ranking Lists -> Top Response Time of Dialog Steps" (also known as the Hit List) for the past 3 month's data, one month at a time.
All three months showed a similar pattern.  The worst performing reports were customer developed reports.  The database time looked reasonable (< 40% of Response Time - Wait Time), processing time was long-ish and load/gen time was tiny.
What didn't look good was the "Roll Wait Time".

I backtracked a step and looked again at the "Workload Overview", and specifically tab "Parts of response time":

[Screenshot: ST03 Workload Overview - "Parts of response time" tab]

That's right: 26% of the response time for Dialog is Roll Wait Time.
Time to dig out the definition of SAP Roll Wait Time...
I've blogged about the basic performance tuning steps before, but I've yet to deep dive into the Roll Wait Time metric.

What is "Roll Wait Time"?
SAP note 8963 says:
"The roll wait time is time that elapses on the client while the user context is in the roll area. Therefore, no resources are used on the client and there is no bottleneck here if the roll wait time is long. "
So we may not be looking at a problem with the SAP server, but a problem between the SAP application and the front-end.
The note goes on to state that for DIALOG processing, the roll out of the context can be caused by calls to other external applications from the R/3 system, or from a call to an external RFC.
More importantly, it says:
"As of Release 4.6, roll wait times also occur when the R/3 system communicates with the controls in the front end. While the data in the front end is processed, the R/3 context is rolled out, and the work process is released as a result. This may occur several times during a transaction step. If front-end office applications (such as Microsoft Word) are started and only closed after a long time (more than several minutes), a very long roll wait time also occurs."
This means that the communication to the front-end (SAP GUI, Microsoft Excel, Word etc), can cause the DIALOG work process to roll out, subsequently increasing the "Roll Wait Time".
Even further clarification is provided in SAP notes 364625 and 376148 which mention the "new" GUI controls introduced in R/3 4.7.
SAP note 606610 explains how an ABAP "WAIT" command causes a task to roll-out of the work process.

SAP note 203924 provides some more detailed information on high Roll Wait Time:
"As of Release 4.6 the roll wait time (and therefore the total response time) contains the time for installing "Enjoy Elements" (=controls) and therefore the essential part of the communication between the GUI and the R/3 instance. In this case, the response time displayed in Transaction ST03 is closer to (almost identical to) the time experienced by the user."
Plus it also confirms what we're thinking:
"As of Release 4.6, therefore, a high roll wait time is a first indication of a slow network between the GUI and the R/3 instance."
Section 2a of the note provides some good pointers in diagnosing network performance issues and checking the size of the SAP user menus.

As per the note, I opened up transaction ST06 and via the "Detailed Analysis" button, I went to the "LAN Check by PING" screen, then clicked "Presentation Server":

[Screenshots: ST06 -> Detailed Analysis -> LAN Check by PING -> Presentation Server]


Once here, I selected a random sample of "Presentation Server" clients and initiated a 10x PING.
What I found was that specific "Presentation Servers" (client PCs) were not responding within the expected 20ms:

[Screenshot: PING results for the selected presentation servers]


I knew that we were operating in a WAN environment (multiple offices across different cities) so I should be expecting a WAN connection time of between 50ms and 250ms (according to SAP note 203924).
I was seeing ~60ms in some cases.  So I can conclude that we have a moderately quick WAN setup.
The main point is that I was not seeing any packet loss.  Which is a good sign.

Whilst the immediate data is always good to see, it's worth mentioning that the speed could be circumstantial.  It's best to check the times in our ST03 statistics.  Opening up ST03 again and navigating to the "Workload Overview" analysis view, I can see on the "Times" tab, an "Average GUI Time per Dialog Step" of 90.4ms:

[Screenshot: ST03 "Times" tab - Average GUI Time per Dialog Step]
[Diagram: round trip / GUI Time]


GUI Time:  Measured at the Application Server, this is the average time taken for all requests (of a Dialog step) to go from the Application Server to the front-end SAP GUI (it does not include the time taken for the SAP GUI to send data to the Application Server).
The time I see is not too bad and will vary depending on the amount of data sent.

We can verify the value by checking the "Average Front-end Network Time" (per Dialog step):

[Screenshot: ST03 "Times" tab - Average Front-end Network Time]
[Diagram: Front-end Network Time]


Front-end Network Time:  Measured at the SAP GUI, we see that the front-end has recorded an average of 91.4ms, made up of the time taken for the first request (of a Dialog step) to go from the Application Server to the front-end SAP GUI, plus the time taken for the last request (of a Dialog step) to go from the Application Server to the SAP GUI, plus the processing required on the front-end for screen formatting.  This roughly agrees with our network finding of a 60ms ping time (with no GUI processing at all), which means that on average we're probably seeing ~30ms of time for the SAP GUI to perform its rendering.

Based on the above findings I can rule out networking as a probable cause of the high Roll Wait Time, as it seems to be (on average) pretty normal.  I could still recommend that my client use the "Low Speed Connection" setting in the SAP GUI, as SAP recommends this for WAN setups (see SAP note 203924 section 4).  I could also recommend reverting to the Classic Theme in the GUI, which is recommended in the same note.

SAP note 62418 discusses the typical amount of data sent per Dialog step in SAP ECC 6.0.  Around 4.6KB is to be expected (per Dialog step).  It doesn't state whether that assumes a normal connection setup or the "Low Speed Connection" setting.
We can also look at the relationship of GUI Time to Roll Wait Time.  There's something there that doesn't look quite right.

If I go back to the Workload Overview, I can see that for DIALOG tasks, the "Average GUI Time" is 90.4ms, which should be almost equal to the "Roll-out" + "Roll Wait" + "Roll In" (these happen whilst the GUI is doing its stuff - during the RFC).

Except, in my case, I can see that the GUI (plus the time to talk to it) is doing its stuff much quicker than the Application Server is rolling out and waiting (NOTE: We've had to drag the columns around in ST03 and we're on the "All Data" tab):

[Screenshot: ST03 "All data" tab - Roll Out, Roll Wait and Roll In times]


0.4 + 147.7 + 15.8 = 163.9ms.

This is 163.9 - 90.4 = 73.5ms slower (on average, per Dialog step) than I would have expected!
This is ~12% of the response time (the Roll Wait Time is ~26% of the response time).

These are the possible reasons I considered to explain this:
  • Bad network performance.  We've looked at this and GUI Time is actually pretty normal; why would waiting around take longer than the work itself?  Network performance seems good.
  • Lack of dialog work processes.  It can't be this, because a shortage of work processes would not be attributed to Roll Wait Time; it would instead be measured as Dispatcher Queue Time (also known as "Wait Time", without the word "Roll").
  • Longer time to Roll In and Roll Out.  It's possible that Roll In and Roll Out time could affect the calculations.  I'm seeing an average Roll In time (per dialog step) of 15.8ms and a Roll Out time of 0.4ms.  But this still doesn't add up to 73.5ms.
  • Time taken to initialise and open the RFC connection to the SAP GUI.  It's possible that the network lookup/hostname buffer is slow to get going before we manage to open the connection to the SAP GUI, but ~73.5ms would be pretty slow.
I needed to get out of the aggregated data and into the real nitty gritty.
I needed transaction STAD and some real live data.

Opening STAD, I left the default options, which read the statistics records from the last 10 minutes.
Once the records were displayed, I changed the layout of the screen to remove the fields I wasn't interested in, then added the fields I was interested in:
- Started
- Server
- Transaction
- Program
- Type
- Screen
- Work Process
- User
- Response Time
- RFC+CPIC
- Roll Wait
- GUI Time

Once I had the screen set up, it was a simple case of scanning through the records looking for any that had "Roll Wait" > "GUI Time" with 0 "RFC+CPIC".

I found some!
One of the records (shown below) has a huge "Roll Wait" time of around 5 seconds, yet "GUI Time" is zero:

[Screenshot: STAD record with ~5 seconds of Roll Wait time and zero GUI Time]


It just so happens that at the same time as this STAD record was created, I was also running an Enqueue Trace, SQL Trace and Buffer Trace from transaction ST05 (yeah, it was a quiet day, so I felt confident that dipping in and out of tracing wouldn't hurt performance too badly).

So I had an opportunity to display these traces for the same period of time (and the same Work Process number).
I found that there were no long durations in the entire trace.  In fact, the total duration of the trace period didn't add up to even 1 second.  What!

I sorted the trace by the "Time" column and manually scrolled through, looking for the 5 seconds between one row and the next, and sure enough, I found it.  The missing 5 seconds:

[Screenshots: ST05 trace sorted by time, showing the 5-second gap]


I single clicked the first record after the time gap, and because it was an ABAP program, I was able to click the "Display Call Positions in ABAP Programs" button:

[Screenshots: "Display Call Positions in ABAP Programs" - the ABAP source position]


The source code position that I arrived at didn't seem to be doing anything other than an Open SQL statement, so I scrolled back up the program a little.
Then I saw it.  The program was moving UNIX files around.  Not only that, but there was an ABAP command "WAIT UP TO 5 SECONDS.":

[Screenshots: the ABAP source containing "WAIT UP TO 5 SECONDS."]

Here's what the ABAP syntax help says:
"This statement interrupts the execution of the program by the number of seconds specified... ... after the specified time has passed, the program continues with the statement following WAIT."
It also states:
"Each time the statement WAIT is used, a database commit is performed.  For this reason, WAIT must not be used between Open SQL statements that open or close a database cursor."

SUMMARY:

We've seen how "GUI Time" is measured and checked the network performance stats to see how accurate it is.
We've also learned how "GUI Time" is actually related in some ways to the value of the "Roll Wait Time".

It's been a long hard slog to get to the bottom of why I have a high average "Roll Wait Time" shown in ST03, when the average "GUI Time" is much lower.  A hardcoded ABAP statement was causing my work processes to WAIT for a fixed period of time, increasing the overall Response Time reported in ST03.  We referenced SAP note 606610 at the beginning of this article, but it seems very difficult to find out (without going through the ABAP source) whether a WAIT statement has been the cause of Roll Wait Time.

We have subsequently learned that the ST03 Dialog Response Time measurements should be taken lightly, and that you should always try to exclude "GUI Time" by using the "Response Time Distribution" analysis view and clicking the "Dialog Without GUI Time" button.  This will exclude the "Roll Wait Time" as described in SAP note

ADDITIONAL NOTES FOR REFERENCE:
During my investigation, I also found a few other things.

We were actually suffering from the program error described in SAP note 1789729, so there is some potential to get better performance from the system by eliminating the additional database & buffer calls.

Some of the records in STAD contained HTTP records.
When I analysed these, I could see that certain document related transactions were calling out to the SAP HTTP Content Server to access documents.
I managed to determine that the "Call Time" for the HTTP access was recorded as "Processing Time" in the overall Response Time.
So, if you use a Content Server, be sure to check the response times, as this could also be a factor in slower response times, and this wasted time *is* recorded in the overall Response Time.
Obviously, using SSL will make this process slightly slower, so maybe some form of front-end cache would be better.

Thanks for reading.

HowTo: Install HANA Lifecycle Manager

Scenario:  You've followed my previous post on how to install a basic HANA DB and now you would like to install HANA LM into the same instance so that you can patch HANA and perform other LM tasks.

What you will need:
- A working HANA DB instance.
- The HANA installation media (usually an ISO file).

This will take less than 10 minutes if you already have the install media.
I have already converted the HANA install media DVD into an ISO file so that I can easily mount it as a virtual cdrom in a VM.
If you haven't converted the install media to an ISO, you can always upload the media directly, or if you're working in a VM, you can use the Shared Folders functionality to share a folder directly from the host O/S to the guest SUSE Linux VM.

Mount the ISO as a cdrom (I'm using a Virtual Machine as my SUSE HANA server).

On the cdrom you will have a directory containing the installation media for the HANA LM tool:

# cd /media/DATA_UNITS/SAPHANALM_LINUX_X86_64

As the root Linux user, run the installation tool (note: your HANA system can be shut down during this process, especially if you have a low amount of memory):

# ./hdbinst

SAP HANA Lifecycle Manager installation kit detected.

SAP HANA Database Installation Manager - SAP HANA HLM Installation 1.50.00.000000
*********************************************************************************
Options:
  SAP HANA system ID | Description
  ---------------------------------------------------------
  H10                | SAP HANA Database H10 1.00.70.386119

Enter SAP HANA system ID [H10]:

Root user password. Mandatory for Distributed system with not configured Trusted SSH Connectivity, or else not applicable. [""]:

Root user SSH passphrase. Optional for Distributed system with configured Trusted SSH Connectivity, or else not applicable. [""]:

Checking installation...
Preparing package "SAP HANA lifecycle manager"...
Installing SAP HANA Lifecycle Manager to /hana/shared/H10/HLM...
Installing package 'SAP HANA lifecycle manager' ...

Installation takes approximately 10 minutes.

From HANA Studio, you can now open the Lifecycle Manager:

[Screenshot: SAP HANA Lifecycle Manager opened from HANA Studio]

HowTo: Disable HANA Web Dispatcher

Scenario: The SAP HANA Web Dispatcher seems to be automatically running in HANA 1.0 SPS70.  I am supposing that this is mainly for the XS-Engine.

If you have already removed the XS-Engine (see my post here), then you can also disable the Web Dispatcher as follows (this will save around 300MB of memory).

From HANA Studio, change the daemon.ini configuration parameter "sapwebdisp -> instances" to "0" for your host(s):

[Screenshot: daemon.ini parameter "sapwebdisp -> instances" set to 0]
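
If you prefer the SQL console to the ini editor in HANA Studio, the same parameter can be set with a statement along these lines (a sketch; replace <your_host> with your HANA host name and adjust to your landscape):

ALTER SYSTEM ALTER CONFIGURATION ('daemon.ini', 'HOST', '<your_host>')
  SET ('sapwebdisp', 'instances') = '0' WITH RECONFIGURE;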


Restart the HANA instance.
The Web Dispatcher process will no longer be present:

[Screenshot: HANA process list without sapwebdisp]


No more Web Dispatcher.

Why SAP Learning Hub Is Great Value

I run my own company and I am primarily a BASIS guy, secondarily a DBA.
With the launch of SAP HANA, the new In-Memory DB platform from SAP, I decided that I needed to prove my skills in this popular new technology before it took off in the mainstream.

Being in control of my own training plan means that I can see the issues large companies have.
Budget!
Training is not cheap when you factor in the cost of the training, the overnight hotel stays, breakfasts, evening meals and car fuel, plus you have to pay the employee for the day.
Advances in technology mean you can now complete some training online.  This is obviously sacrificing the usual face-to-face interaction and dynamism you get in a classroom, but if you are a capable learner (you need a specific technique), then you can benefit from the flexibility of online learning.

SAP launched the Learning Hub primarily to provide a method of easily selecting and following a training plan or certification path.
Secondly, the Learning Hub provides the perfect place to manage and distribute online training content.

Let's get to the point:
How does it compare cost-wise with classroom based training?  Well here's how I worked it out:
(Certification C_HANATEC131 proposes that courses HA100 and HA200 should be completed.)

- SAP HANA HA100 classroom cost:  £1040.
- SAP HANA HA200 classroom cost:  £2600.
- Travel & overnight stay costs (for me): £120 per night = ~ £840
- C_HANATEC131 certificate exam: £350

TOTAL for classroom training & certification exam: £4830.

Compare the above total to the Learning Hub method that I used:

- 12 months subscription to SAP Learning Hub:  £2400.
(Courses HA100E and HA200R are in the catalogue)
- Travel & overnight stay costs (for me) one night: £120
- C_HANATEC131 certificate exam: £350

TOTAL for SAP Learning Hub online training & certification exam:  £2870.

As you can see, the Learning Hub route gave me much better value.

And the best is still to come...
With the 12-month subscription, I get access to ALL of SAP's eLearning courses.
Not only can I now choose another set of courses, but should I decide to certify on another topic, I just need to pay for the exam and I've saved money yet again.

There must be a downside?
Not necessarily.  It does mean that you have to be a certain type of learner.  You need a learning technique that suits you and a method of time control that stops learning overload.
Being my own boss means I can take time to train in-between contracts, but you could also perform your training online in the evenings.

My method:
- Find the certification exam you want to complete, on http://training.sap.com

[Screenshot: finding the certification exam on training.sap.com]

- Expand the "Topic Areas" section on the page and you will see the topic course recommendations on the right.


[Screenshot: the "Topic Areas" section expanded]

- Print the page with the areas expanded, so you can see the "%" of relevance to the certification exam.

- Use the Learning Hub to access the online version of the required/recommended training courses.
These usually also include a PDF document with the course content.
Don't forget to include the install and upgrade guides if relevant.

- Upload the PDF(s) to your tablet for a little easy reading when lounging (or when you have some dead time).

- Write key concepts on Post-It Notes and stick them somewhere you look at regularly (a wall maybe).



Don't move them once stuck, because this helps you visualise them in your mind whilst learning.
You'll come to know exactly on the wall where certain notes are.  That's because you've remembered them with the associated place on the wall.

- Write notes in a book or notepad as you go through the learning material.
Don't write long paragraphs, and definitely copy down diagrams; it helps reinforce the picture contents.

- Review and revise often.
You don't need long, maybe 20 minutes.
Sometimes, just staring and going through the Post-It Notes will help them stick.

- Look up any acronyms you don't know.

- Don't be too concerned with the test exams; they are not very accurate or of good quality.
When you think you're ready, book your exam.  You can always re-take it if you have been unsuccessful.

Good luck.

HowTo: Check HANA LM Is Running

Scenario: You want to check if the SAP HANA Lifecycle Manager is running/installed.

The SAP HANA Lifecycle Manager is installed separately from the HANA DB and runs in its own Java VM.
It's installed by default into the "/usr/sap/hlm_bootstraps" directory and occupies ~700MB of disk space.

By default, the HLM is not started with the instance.  It gets started when you call it from HANA Studio, or if you manually start it from the Linux command line using the bootstrap-hlm.sh script located in "/usr/sap/hlm_bootstraps/<SID>/HLM".

From HANA Studio, right click the HANA instance as SYSTEM, then select "Lifecycle Management":

[Screenshot: the "Lifecycle Management" option in HANA Studio]


From the command line on the Linux server, as the <sid>adm Linux user:

> cd /usr/sap/hlm_bootstraps/H10/HLM
> ./bootstrap-hlm.sh

You will be dropped into the OSGi (Open Services Gateway initiative, see here: http://www.osgi.org/) command line.

HANA Studio - Diagnosis Mode Connection Overload

Be careful when using HANA Studio in Diagnosis Mode with the refresh interval set to a low value.
When set to 5 seconds (the default), the number of connections opened to the HANA DB is one every 5 seconds:

[Screenshot: Diagnosis Mode refresh interval]

If you check the number of connections with a tool such as TCPView or Process Monitor, you will see a very high number of ESTABLISHED connections over time:

[Screenshot: TCPView showing many ESTABLISHED connections to the HANA SQL port]

Note that the HANA DB SQL port is 3<xx>15.

Under heavy network load, you could be causing extra strain on your PC, the network and the HANA server.

Simply decrease the refresh frequency (i.e. increase the refresh interval) and this will allow your PC to close off the unwanted connections before the new ones are created, reducing your CPU consumption.

DBCLONE Is Still Running, Running & Running Running...

Scenario: You're running through an upgrade of SAP on Oracle, either applying an EHP or a support package stack update.  You're using the "Standard" downtime-minimized approach and you've got to the SUM stage MAIN_SHDCRE/SUBMOD_SHDDBCLONE/DBCLONE and it has just bailed out!

During the DBCLONE step, a number of background jobs are created that copy certain tables, program sources etc., from the current SAP database schema to the shadow instance schema (on Oracle, SAPSR3SHD).
The copies of the tables, sources etc, are placed into the new tablespace that you were asked to create in the earlier steps.

During this copy process, the database will be generating a lot of redo information (it is performing a lot of INSERTs).  This means that it will be generating a lot of archive logs also.  Most systems are in archive log mode by default, as this is the safest way of upgrading a production system.

The DBCLONE step can take a long time depending on a few factors:

  • Size of your SAP system master data.  Since the transactional data is not copied, most SAP systems will be roughly the same sort of size for the master data tables and sources etc. (e.g. tables D010TAB, D010INC, REPOSRC).  Don't forget, once tables are cloned, it needs to build the required indexes on those tables too!  Then it will gather stats on the tables and indexes.
  • Quality of your database.  If your Oracle database is highly fragmented, the indexes are not in good shape, or there is a lack of memory allocated to the database, the cloning will take longer.

  • Redo disk write times.  The faster the write times for redo, the quicker this is going to go.
  • Number of parallelised jobs.  The SUM tool recommends 3 jobs in parallel.  Prior to this SUM step, you would have been asked to configure the number of parallel jobs (and also your number of background work processes).  If you configure fewer than 3, then it will take longer.  I would personally recommend n+3, where n = your normal production number of background work processes.  This means you will not be hampering day-to-day usage by blocking background jobs.  The 3 jobs are created with high priority (Class A) so they get all the background processing they need.
  • Whether you elected to pre-size the new shadow tablespace data files.
    Setting them to autoextend is fine, but by default the SAP brspace commands create the files with only 200MB.  By setting these files to be as big as they need to be (no autoextend), you will save time.
During the DBCLONE step, the SUM tool monitors progress by RFC connection into the SAP system.  It checks to see when the DBCLONE background jobs complete (and that they complete successfully).
If you have limited space available in your archive log area, and this fills up, then the RFC connection from SUM fails to work (archiver stuck issue).
This causes SUM to report that the step has aborted, but that DBCLONE was still running.
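
If your archive logs are written to the flash recovery area, a quick way to see how close it is to filling up is a query like the one below (a sketch; only relevant if db_recovery_file_dest is in use on your system):

SQL> select name,
            round(space_limit/1024/1024) limit_mb,
            round(space_used/1024/1024) used_mb
       from v$recovery_file_dest;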

You will still see DBCLONE running in the background when you resolve the archiver stuck issue.
At this point, you could choose to manually cancel the jobs with "Cancel without Core" in SM50 for the busy background work processes where DBCLONE is running.  However, since they are still running quite happily, simply waiting until they have finished and then restarting SUM will continue from where it left off.

It knows where it got to, because it records the list of tables cloned in the SAPSR3.PUTTB_SHD table by setting the CLONSTATE column to 'X'.
During the cloning process, the tables to be cloned are assigned to background jobs using the CLONSTATE column in the SAPSR3.PUTTB_SHD table.

You can monitor the cloning progress by using the following SQL:

SQL> select count(*) num, clonstate from SAPSR3.PUTTB_SHD group by clonstate;

You will notice that the CLONSTATE column will contain:
'X'  - Tables cloned.
'1'  - Tables on the work list of background job DBCLONE1.
'2'  - Tables on the work list of background job DBCLONE2.
'n'  - Tables on the work list of background job DBCLONEn.
'  '  - Tables not being cloned.

As tables are cloned, the CLONSTATE changes from 'n' to 'X'.
It seems that the larger tables are cloned first.

The method used to clone the tables is: "INSERT into <table> SELECT (<fields>) FROM <source>;".
Then a "CREATE INDEX" is performed.

It's also worth noting that you may need enough PSAPTEMP space to account for the index creation.
In a Solution Manager 7.1 SPS10 system, there are 13109 tables to be cloned.

As a final parting comment, if you have Oracle 11gR2, you should consider the database compression options available to you.  Reducing the I/O requirements will massively help speed up the process.

SAP DBCO Connection Without TNSNAMES

In order to create an external database connection to another database, so that an ABAP program can access it, you normally create the connection details in transaction DBCO.

However, if you use transaction ST04 instead, it provides additional fields where you can enter the connection details (such as the Oracle listener port number), which allows a connection string to be constructed that does not require an entry in the server's TNSNAMES.ora file.
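
For reference, a connection string that avoids a TNSNAMES.ora entry is typically a full Oracle connect descriptor along these lines (a sketch with made-up host, port and SID values; check the exact format your DBCON/DBSL setup expects):

(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=dbhost.mycorp.net)(PORT=1527))(CONNECT_DATA=(SID=RD1)))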

Solution Manager 7.01 MOPZ Stuck Calculating Selection

I had an issue with Solution Manager 7.01 SP24 where I had created a maintenance transaction for an SEM system (with a sidecar Java stack) and it got stuck in the "calculating" step when in the "Selection" stage.
It would just sit on the screen with the blue circular logo spinning and nothing happening.  It did not time out, and when I left it for a day it was still not progressing.

So, I opened another one, and it got stuck at the same point:

[Screenshot: maintenance transaction stuck at "Calculating..."]

I had made a change to the Java stack technical system in SMSY to indicate that the landscape pattern was "SIDECAR" as instructed by the SAP documentation, but this just didn't seem to be working for me.

So I removed the "SIDECAR" definition and now want to cancel the two transactions:

[Screenshot: the two open maintenance transactions]


Following SAP note 1296589, I opened transaction "/TMWFLOW/MAINTENANCE", entered the two "open" transaction IDs and clicked Execute:

[Screenshots: /TMWFLOW/MAINTENANCE selection screen and search results]


The SAP note goes on to say:  "If any MOPZ planning procedure is displayed in the search result with User Status other than "New", then it's the locking planning procedure.".
So we can see that we have both transactions locking the planning procedure.  Woops!

Maintain the table TSOCM_COND_MAPP using SM30 (use a user other than DDIC for this!):

[Screenshot: SM30 maintenance of table TSOCM_COND_MAPP]


Find the line entry "SLMO  SLMO0001  E0002  40  SYSTEM_ASSIGNMENT...":

[Screenshot: the "SLMO  SLMO0001  E0002  40  SYSTEM_ASSIGNMENT..." entry]


Change the column "MT" from "Cancel" to "Warning":

[Screenshot: "MT" column changed from "Cancel" to "Warning"]


Save your change.  You will need to save the change to a transport request:

[Screenshot: transport request prompt]


I then re-opened the maintenance transaction from SOLUTION_MANAGER and unfortunately it was still stuck on "Calculating...".
So, the next step was to try and remove the two transactions.
The SAP notes and SCN both suggested using report CRM_ORDER_DELETE.
From SE38 I ran the report and entered the first transaction ID number (from the maintenance optimizer screen) and "Business Transaction Type" of SLMO:

[Screenshots: CRM_ORDER_DELETE selection screen and output]


I then went back into the Maintenance Optimizer and clicked Refresh:

[Screenshot: Maintenance Optimizer after the refresh]


It's gone!  Only one to go:

[Screenshot: only one transaction remaining]


After removing both old transactions, I went and re-modified the landscape pattern to un-link the Java stack from the ABAP stack (non-SIDECAR).

I then reset the change to the TSOCM_COND_MAPP table and saved it.
I was then able to create a new maintenance transaction and successfully calculate the stack.


Summary:
The SIDECAR landscape pattern in Solution Manager 7.01 SP24 doesn't seem to work as it should and causes issues with the Maintenance Optimizer.  For the time being, it might be easier to try and maintain the ABAP and Java stacks independently.

SAP HANA - SSL Security Essential

The Heartbleed vulnerability exposed the consequences of security holes in software designed to provide encryption of network traffic.
However, this doesn't mean that all encryption software has holes and it's certainly better to have some form of encryption than none at all.

I've watched numerous online demos and official training videos, and worked on real-life HANA instances.  None of these systems so far has enabled SSL (now called TLS) between HANA Studio and the SAP Host Agent, or between HANA Studio and the HANA database.
This means that specific communication between the HANA Studio, the SAP Host Agent and the HANA database indexserver, is not encrypted.

The HTTP protocol has been around for a long time now (thanks Tim).
It is inherently insecure when using HTTP BASIC authentication, since the username and password passed over HTTP to a server that has requested authentication are sent in the clear (unencrypted), merely encoded in BASE64.
The BASIC authentication is used to authenticate the HANA Studio with the SAP Host Agent.

What does this mean with regards to SAP HANA and the SAP HANA Studio?
Well, it means that any user with a network packet sniffer (such as Wireshark) could intercept one vital password, that of the <sid>adm SUSE Linux user.

In a SAP HANA system, the software is installed and owned by the <sid>adm Linux user.  Usually <sid> is a unique identifier for each HANA system in a SAP landscape.  As an example, H10 or HAN or any other 3 alphanumeric combination (within certain SAP restrictions) can be used.
When the HANA Studio is used to control the HANA database instance (start up and shutdown), the HANA Studio user is prompted to enter the username and password for the <sid>adm user.
This username and password is then sent via HTTP to the SAP Host Agent installed on the HANA server.  The SAP Host Agent uses the username and password to start or stop the HANA database instance.
If the password for the <sid>adm user is obtained, it is possible for a malicious user to establish an SSH connection directly to the SUSE Linux server where the HANA instance is installed, then control the instance, or access the database directly using a command line interface for executing SQL statements.

Here's a 6-step example which took me 10 minutes to set up, trace, collect the data and then log in to the Linux server as an authorised user.

Step 1, Install and open Wireshark (on your PC) and start tracing for TCP connections to the HANA server on the Host Agent TCP port 5<xx>13.
Step 2, Launch HANA Studio (on your PC) and in the navigator right click and choose "Log On":

[Screenshot: HANA Studio logon without SSL]

Step 3, If you haven't elected to save the username and password during previous use of the HANA Studio, you will be prompted.  Otherwise, the system will auto-logon to the Host Agent.
Step 4, Analyse the Wireshark capture.  You're looking for the text "Authorization: Basic" in the TCP packets:

[Screenshot: Wireshark trace of the HANA Studio logon]

The actual string will look something like: 
"Authorization: Basic aDEwYWRtOmhhbmFoYW5h"
I've copied an example HTTP POST out to a text editor for easy viewing:

[Screenshot: the captured SAPControl HTTP POST]

POST /SAPControl HTTP/1.1
Accept: text/xml, text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2
Authorization: Basic aDEwYWRtOmhhbmFoYW5h
Content-Type: text/xml; charset=utf-8
Cache-Control: no-cache
Pragma: no-cache
User-Agent: Java/1.7.0_45
Host: hana01.fqdn.corp:51013
Connection: keep-alive
Content-Length: 248

Step 5, Decode the username and password in the BASIC authentication string using a base64 decoder.  It's possible to use an online one:

[Screenshot: decoding the BASIC authentication string with an online BASE64 decoder]

The output includes the username and password in the following format:
USERNAME:PASSWORD

Step 6, With our new found details, log onto the HANA server using an SSH terminal:

[Screenshot: SSH logon to the HANA server with the captured credentials]

From this point onward it's possible to access any data in the HANA database using command line tools.

SUMMARY:
You MUST enable SSL (TLS) encryption of the HTTP communications between the HANA Studio and the SAP Host Agent.  Without this, you might as well put the password on a post-it note on your screen.
See http://service.sap.com/sap/support/notes/1718944

Another option would be to segregate the HANA Studio users onto their own VLAN, or to firewall the SAP Host Agent and HANA database indexserver ports, tying them to specific user PCs only.
Incidentally, the password for the SYSTEM user of the HANA database is hashed with SHA256.  The hash is then compared with the already-hashed password stored in the HANA database in order to authenticate a user.
However, if you have not enabled SSL between HANA Studio and the HANA database indexserver, then all of the data retrieved from the database is sent in the clear.  You don't need to authenticate to the database if you can just read the network packets.  This is true of most database connections.

HowTo: SAP Kernel Patch History

Scenario: Your SAP system Kernel has been patched.
You would like to see the patch history of the system and you are running on Windows and SQL Server.

You can view the patch history for a DLL or executable (such as disp+work) by querying a table in the SQL Server database as follows (changing the <SID>):

SQL>  select * from <SID>.MSSDWDLLS
    where DLLNAME='disp+work.EXE'
order by DLLNAME, HOSTNAME, DLLPATH, LASTDATE, LASTTIME;


The results will provide a complete traceable history of the system including the previous identity of the SAP system, the different application instances and any inconsistencies in the DLL versions.

[Screenshot: kernel patch history query results]



What are you buying when you purchase an SAP HANA appliance?

I get asked this question from friends who work in IT:
"What are you buying when you purchase an SAP HANA appliance?"

My answer is:  "Time".

SQL Server Shrink Transaction Log Script

Below is a script that shrinks SQL Server transaction logs tested on SQL Server 2008R2.
Before running the script, you should ensure that you take a transaction log backup of your databases (which obviously means you should have already taken a full backup).  The transaction log backup will free the virtual logs from within the transaction log file.
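
For example, a minimal log backup looks like this (a sketch; the database name and backup path are placeholders for your own values):

BACKUP LOG [ERP] TO DISK = N'E:\backup\ERP_log.trn';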

The script simply tries to shrink the transaction log file by 25% for any databases that are not called "MASTER", "MODEL", "MSDB" or "TEMPDB".

If you wish to shrink logs by more than 25%, either change the script, or run it multiple times until it can't shrink the logs any further.

NOTE: When executing the script in SQL Management Studio, you should set the results output to "Text", so that you can see the output of the script.



USE master
GO

DECLARE @database_id    int,
        @database_name  nvarchar(128),
        @file_id        int,
        @file_name      nvarchar(128),
        @size_mb        int,
        @new_size       int;

DECLARE @cmd            nvarchar(max);  -- buffer for the dynamic SHRINKFILE command
DECLARE Cur_LogFiles CURSOR LOCAL FOR
  SELECT database_id,
         UPPER(DB_NAME(database_id)) database_name,
         file_id,
         name file_name,
         (size*8)/1024 size_mb
   FROM  master.sys.master_files
   WHERE type_desc = 'LOG'
     AND state_desc = 'ONLINE'
     AND UPPER(DB_NAME(database_id)) NOT IN ('MASTER','TEMPDB','MSDB','MODEL')
   ORDER BY size_mb DESC;

BEGIN
    OPEN Cur_LogFiles
    FETCH NEXT FROM Cur_LogFiles
       INTO @database_id, @database_name, @file_id, @file_name, @size_mb
    WHILE @@FETCH_STATUS = 0
    BEGIN
      -- Determine 25% of our current logfile size.
      SET @new_size = @size_mb*0.75;
      -- Set context to the database that owns the file and shrink the file with DBCC.
      PRINT 'Database: ' + @database_name + ', file: ' + @file_name + ', size: ' + CONVERT(varchar,@size_mb,25) + 'mb';
      PRINT 'Attempting to shrink file: ' + @file_name + ' by 25% to: '+ CONVERT(varchar,@new_size,25) + 'mb';

      -- Build the command in a variable first: CONVERT() cannot be called directly
      -- inside EXEC('...'), and concatenating the int @new_size straight into the
      -- string would fail with an implicit conversion error.
      SET @cmd = 'USE [' + @database_name + ']; DBCC SHRINKFILE ([' + @file_name + '],' + CONVERT(varchar(20), @new_size) + ');';
      EXEC (@cmd);

      FETCH NEXT FROM Cur_LogFiles
         INTO @database_id, @database_name, @file_id, @file_name, @size_mb
    END;

    CLOSE Cur_LogFiles
    DEALLOCATE Cur_LogFiles

END;







HowTo: Check SAP SUM Version Without Executing

Scenario: You have extracted, or previously used and subsequently found, a SUM directory on your SAP system.
This is production and you don't want to start SUM on this system.
You want to know what version of SUM it is.

You can check the summanifest.mf file for the SUM version without needing to start SUM:

> cd <SUM PATH>\SUM

> more summanifest.mf
Manifest-Version: 1.0
keyname: SUM
keyvendor: sap.com
keylocation: SAP AG
os: NTAMD64
compilation mode: UNICODE
compiled for: 64 BIT
release: 1.0
support package: 7
patch number: 2

native branch: lmt_008
java branch: lmtj_008_REL
assembly time: 2013-05-13 05:10:16
pack version: 22
pack tool version: 1.042





You are interested in the lines:
"release", "support package" and "patch number".

The example above is therefore SUM 1.0 SP07 patch 2.

RMAN 10.2 Block Corruption Checking - Physical, Logical or Both

It's an old topic, so I won't dwell on the actual requirements or the process.

However, what I was not certain about, was whether RMAN in 10.2 (10gR2) would perform both physical *and* logical corruption checking if you use the command:

RMAN> BACKUP VALIDATE CHECK LOGICAL DATABASE;

I kept finding various documents with wording like that found here: http://docs.oracle.com/cd/B19306_01/backup.102/b14191/rcmbackp.htm#i1006353
"For example, you can validate that all database files and archived redo logs can be backed up by running a command as follows:
RMAN> BACKUP VALIDATE DATABASE ARCHIVELOG ALL;

This form of the command would check for physical corruption. To check for logical corruption,
RMAN> BACKUP VALIDATE CHECK LOGICAL DATABASE ARCHIVELOG ALL;"


It took a while, but I found the original document from Oracle here: http://docs.oracle.com/cd/B19306_01/backup.102/b14191/rcmconc1.htm#i1008614

Right at the bottom, it confirms that ordinarily "BACKUP VALIDATE DATABASE;" would check for physical corruption.
The additional keywords "CHECK LOGICAL" will check for logical corruption *in addition* to physical corruption.

So RMAN doesn't need to be run twice with each VALIDATE command combination.

SAP HANA - Migrate Statistics Server 1917938

Since SAP note "1917938 - Migration of statistics server with upgrade to SPS 7" seems to be going missing rather a lot, I've noted the content here for reference based on v10 05-05-2014.

If you do not monitor or administrate the upgraded SAP HANA database with the DBA Cockpit or Solution Manager, you can activate the new statistics server. If the DBA Cockpit or Solution Manager is active, you are only allowed to activate the new statistics server if you observe SAP Note 1925684.

A configuration change is required to activate the new statistics server:

nameserver.ini -> [statisticsserver]->active=true
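
If you'd rather make this change from the SQL console than by editing the ini file in HANA Studio, a statement along these lines should set the same parameter (a sketch; adjust the layer to suit your landscape):

ALTER SYSTEM ALTER CONFIGURATION ('nameserver.ini', 'SYSTEM')
  SET ('statisticsserver', 'active') = 'true' WITH RECONFIGURE;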

The data held in the persistence of the statistics server is now transferred to the persistence of the master index server. At the end of the migration, the statistics server is automatically stopped and removed from the database configuration (topology). The functions of the statistics server are distributed to other services.

The migration of the statistics server is carried out without interrupting the backup history, which means that data and log backups created before the migration can still be used to restore the SAP HANA database.

The HANA instance must not be restarted during the migration.
The migration is completed when no statisticsserver process is running in the HANA instance.
It is not necessary to restart the HANA instance following the migration.
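
To confirm the migration has finished, you can query the M_SERVICES monitoring view (a simple sketch; once the migration is complete, no statisticsserver row should be returned):

SELECT host, service_name, active_status
  FROM m_services
 WHERE service_name = 'statisticsserver';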

HowTo: Using DBMS_STATS to Restore Stats

Scenario: You're about to implement some major changes to the Oracle database that will adjust the database statistics.
Oracle provide the DBMS_STATS package to help administer stats.  The package includes some procedures that can be used to export/import stats, plus restore them to a previous point in time.

When statistics are updated using DBMS_STATS.GATHER_*_STATS, the previous version is saved in the database.  The retention of these old versions can be changed with the DBMS_STATS.ALTER_STATS_HISTORY_RETENTION procedure.  Also see the view DBA_TAB_STATS_HISTORY.

These versions are retained for a specific retention period, which you can check using the GET_STATS_HISTORY_RETENTION function:

SQL> set serveroutput on
SQL> DECLARE
v number;
BEGIN
v := DBMS_STATS.GET_STATS_HISTORY_RETENTION;
DBMS_OUTPUT.PUT_LINE('Stats history retention: ' || v || ' days.');
END;
/

Stats history retention: x days.

PL/SQL procedure successfully completed.

You can also check the date of the oldest stats history:

SQL> set serveroutput on
SQL> DECLARE
v timestamp;
BEGIN
v := DBMS_STATS.GET_STATS_HISTORY_AVAILABILITY;
DBMS_OUTPUT.PUT_LINE('Oldest stats history: ' || v);
END;
/

Oldest stats history: 15-DEC-13 11.29.32.143688 PM

PL/SQL procedure successfully completed

To restore the statistics you can use one of the relevant procedures:

DBMS_STATS.RESTORE_DICTIONARY_STATS
DBMS_STATS.RESTORE_FIXED_OBJECT_STATS
DBMS_STATS.RESTORE_SCHEMA_STATS
DBMS_STATS.RESTORE_SYSTEM_STATS
DBMS_STATS.RESTORE_TABLE_STATS

See here for parameters:
http://docs.oracle.com/cd/E18283_01/appdev.112/e16760/d_stats.htm#insertedID2
As an example, the RESTORE_SCHEMA_STATS procedure takes the following parameters:

ownname   Schema owner,
as_of_timestamp   Timestamp to restore the statistics to,
force   TRUE|FALSE   Restore even if stats are locked, default TRUE,
no_invalidate   TRUE|FALSE   Do not invalidate dependent cursors, default get_param('NO_INVALIDATE').
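
As a worked example, restoring the SAP schema's statistics to how they looked 24 hours ago might be done like this (a sketch; the schema name SAPSR3 and the 1-day offset are assumptions to adapt):

SQL> BEGIN
       DBMS_STATS.RESTORE_SCHEMA_STATS(
         ownname         => 'SAPSR3',
         as_of_timestamp => SYSTIMESTAMP - INTERVAL '1' DAY,
         no_invalidate   => FALSE);
     END;
     /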

If the stats are restored to a specific timestamp, it means that whatever statistics values were applicable to a specific table at a specific point in time, are applied to the tables.  If the table's statistics are not changed then there will be gaps in the history.
You can imagine this being like a roll-forward through the DBA_TAB_STATS_HISTORY table, until the timestamp specified.

WARNING: If the table's statistics are not changed then there will be gaps in the history.  In which case, you may not be able to restore previous statistics if the table stats have not changed within the last history window (default 31 days).

Some great examples are here: http://www.morganslibrary.org/reference/pkgs/dbms_stats.html

You should also note that under an SAP system, the Oracle stats gathering is performed by BR*Connect, which calls the GATHER_TABLE_STATS procedure for each table that has statistics collection enabled in table DBSTATC.
If the table is not enabled to collect stats, then it may have stats delivered by SAP (see SAP note 1020260), in which case, there may not be any history.

Also see my blog post on SAP statistics and DBSTATC.

HANA Memory Allocation Limit Change With hdbcons

The SAP HANA command line utility hdbcons can be used to administer the HANA system directly from the Linux operating system command line.

It's a very powerful utility and I'm sure as time goes by, it will provide invaluable information and functionality for HANA DB administrators to help fine-tune the database.

Log in as the <sid>adm Linux user and run the hdbcons command.
It will automatically connect to the indexserver for that instance (if it's running).
You can then list the available server commands:

hana01:/usr/sap/H10/HDB10> hdbcons
SAP HANA DB Management Client Console (type '\?' to get help for client commands)
Try to open connection to server process 'hdbindexserver' on system 'H10', instance '10'
SAP HANA DB Management Server Console (type 'help' to get help for server commands)
Executable: hdbindexserver (PID: 4092)
[OK]
--
>
> help

Available commands:
ae_checksize - Check and Recalculate size of columns within the Column Store
authentication - Authentication management.
bye - Exit console client
cd - ContainerDirectory management
checktopic - CheckTopic management
cnd - ContainerNameDirectory management
context - Execution context management (i.e., threads)
converter - Converter management
crash - Crash management
crypto - Cryptography management (SSL/SAML/X509).
deadlockdetector - Deadlock detector.
debug - Debug management
distribute - Handling distributed systems
dvol - DataVolume management
ELF - ELF symbol resolution management
encryption - Persistence encryption management
event - Event management
exit - Exit console client
flightrecorder - Flight Recorder
help - Display help for a command or command list
log - Show information about logger and manipulate logger
mm - Memory management
monitor - Monitor view command
mproxy - Malloc proxy management
mutex - Mutex management
output - Command for managing output from the hdbcons
pageaccess - PageAccess management
page - Page management
pcm - Performance Counter Monitor management
profiler - Profiler
quit - Exit console client
replication - Monitor data and log replication
resman - ResourceManager management
runtimedump - Generate a runtime dump.
savepoint - Savepoint management
snapshot - Snapshot management
statisticsservercontroller - StatisticsServer internals
statreg - Statistics registry command
stat - Statistics management
tablepreload - Manage and monitor table preload
tracetopic - TraceTopic management
trace - Trace management
version - Version management
vf - VirtualFile management

For each of the above commands, additional help can be obtained by re-issuing "help" plus the command name e.g.:

> help mm

Synopsis: mm <subcommand> [options]: Memory management Available subcommands:
   - list|l [-spf] [-l <level>] [<allocator_name>]: List allocators recursively
      -C: work on composite allocators
      -s: Print some allocator statistics (use stat command for full stats)
      -p: Print also peak usage statistics
      -f: Print allocator flags
      -S: Sort output descending by total size
      -l <level>: Print at most <level> levels
   - flag|f [-C] <allocator_name> [-rsdt] <flag_set>: Set allocator flags
      -C: work on composite allocators
      -r: Set flags recursively
      -s: Set given flag(s), default
      -d: Delete given flag(s)
      -t: Toggle given flag(s)
      -a: Also apply changes to associated composite allocators (not allowed in context with '-C')
   - info|i [-f] <address>: Show block information for a given address
      -f: Force block info, even if block not known
   - blocklist|bl [-rtsS] <allocator_name> [-l <count>]: List of allocated blocks in an allocator
      -C: work on composite allocators
      -r: Show blocks in sub-allocators recursively
      -s: Show also allocation stack traces, if known. Cannot be combined with optiont '-t'.
      -S: Show only blocks with known allocation stack traces. Cannot be combined with optiont '-t'.
      -t: Show top allocation locations sorted by used size (descending).
            Cannot be combined with options '-s' or '-S'.
            The default number of printed locations is 10, this can be changed by '-l' option.
      -l <count>: Limit to <count> locations. Valid only if combined with option '-t'. Unlimited in cas <count> = 0.
   - maplist|ml: List all mappings used by the allocator
   - mapcheck|mc [-sln]: Check all mappings used by the allocator
      -s: Also show all system mappings
      -l: Also show own mappings as in maplist
      -n: Suppress output of known alloc stack traces for unaccounted memory
   - mapexec|me [-u]: List all known executable module mappings
      -u: Update list of known executable modules
   - reserveallocator|reserveallocators: Print information about OOM Reserve Allocators
   - top [-C] [-sctr] [-l <count>] [-o <fn>] [<allocator_name>]: List top users of memory, <allocator_name> is needed except for option '-t'
      -C: work on composite allocators
      -s: Sort call stacks by used size (default)
      -c: Sort call stacks by call count
      -t: Throughput, sort nodes by sum of all allocations
      -r: work recursively also for all suballocators
      -l <count>: Only output top <count> stack traces (0=unlimited)
      -o <fn>: Specify output file name
   - callgraph|cg [-sctrk] [-l <count>] [-o <fn>] [<allocator_name>]: Generate call graph, <allocator_name> is needed except for option '-t'
      -C: work on composite allocators
      -s: Sort nodes by used size (default)
      -k: Ouput in kcachegrind format
      -c: Sort nodes by call count
      -t: Throughput, sort nodes by sum of all allocations
      -r: Work recursively also for all suballocators
      -d: Do full demangle (also parameters and template arguments)
      -l <count>: Only output top <count> functions (0=unlimited)
      -o <fn>: Specify output file name (for .dot graph)
   - resetusage|ru [-r] [<allocator_name>]: Reset call stack usage counters, <allocator_name> is needed except for option '-r' or '-C'
      -C: work on composite allocators
      -r: Work recursively also for all suballocators
   - limit [-s <size>[K|M|G|T]]: Get current allocation limit in bytes
      -s <size>[K|M|G|T]: Set current allocation limit in bytes, KB, MB or GB
   - globallimit [-s <size>[K|M|G|T]]: get current global (if IPMM is active) allocation limit in bytes
      -s <size>[K|M|G|T]: Set current global (if IPMM is active) allocation limit in bytes, KB, MB or GB
   - garbagecollector|garbagecollection|gc [-f]: Return free segments/blocks to
     operating system
      -f: Also return free fragments in big blocks
   - ipmm: Print Inter Process Memory Management information
      -d: Print detailed information.
   - compactioninfo, ci: Print information about last compaction
   - virtual: Print information about virtual but not resident memory (linux only)
   - requested: Print information about requested allocations (reporting no overhead at all), iterates over all instances of ReportRequestedAllocators
   - blockedmemory [-s <size>[K|M|G|T]]: Get current blocked memory.
      -s <size>[K|M|G|T]: Set current blocked memory in bytes, KB, MB or GB and try to reserve this memory. Common options and arguments:
   - <allocator_name>: Name of the allocator to address
   - <flag_set>: Comma-separated list of following flags: ffence (fence front, writes the pattern 0xaaccaacc in front of the allocated block),
     bfence (fence back, writes the pattern 0xaaccaacc behind the allocated block), astrace (stack trace at allocation),
     dstrace (stack trace at deallocation), areset (overwrite at allocate with pattern 0xd00ffeed),
     dreset (overwrite at deallocate with pattern 0xdeadbeef), all, none, default, !emptyok (allow
     non-empty destruction), preventcheck (prevent changing check flags)
     atrace (trace at allocation), dtrace (trace at deallocation),
     malf (malfunction) or their 2-letter shortcuts [OK]
--
>

As an example, the current global allocation limit can be displayed:

> mm globallimit
Current global allocation limit=15032385536B.
[OK]
--

We can adjust the global allocation limit online by issuing the additional "-s" parameter:

> mm globallimit -s 16G
Current global (if IPMM active) allocation limit: 17179869184B
[OK]
--
>

Now re-check:

mm globallimit
Current global allocation limit=17179869184B.
[OK]
--
>

What's the current global allocation limit in global.ini, you might ask?

hana01:/usr/sap/H10/HDB10> grep '^global'  /usr/sap/H10/SYS/global/hdb/custom/config/global.ini

global_allocation_limit = 14336

It hasn't changed.

So we have confirmed that we can affect the configuration of the HANA system in real-time using hdbcons, but we don't necessarily preserve the configuration.
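
If you do want the new value to survive a restart, persist it in global.ini as well, for example from the SQL console with something like this (a sketch; the value is in MB and 16384 simply matches the 16GB set above):

ALTER SYSTEM ALTER CONFIGURATION ('global.ini', 'SYSTEM')
  SET ('memorymanager', 'global_allocation_limit') = '16384' WITH RECONFIGURE;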
You can also check in HANA Studio on the landscape page.
Since each service (process) takes its memory allocation percentage from the global allocation, this will automatically change in real-time too.
This means that for analysing "what-if" style scenarios or for operational acceptance testing, you can effectively present a set of configuration values and work through them almost automatically.  Invaluable for those test automation guys in the field.


Native or Direct HANA Connection

When using the SAP HANA Studio, you can configure the connection to the HANA DB as NATIVE or DIRECT (from Window -> Preferences -> General -> Network Connections).
When NATIVE is selected, this means that the internet connection details of the native Operating System are used.

In Windows this is usually the same as the "Internet" settings configured through Windows Control Panel, or through Internet Explorer's options menu.

In this setting, HANA Studio will utilise whatever proxy server you have configured for internet access.
When DIRECT is selected, any proxy settings are controlled from within HANA Studio and any Operating System settings are ignored.

This is the reason that SAP say to select DIRECT whenever you experience connectivity issues.
Are there any performance benefits from DIRECT?  Maybe.  Since HANA Studio doesn't need to query the Windows registry for proxy server settings, it's possible that the initial connection could be established more quickly.  Once connected, it probably doesn't make any performance difference over NATIVE.

HANA - Emergency Shutdown

For testing purposes, you can perform an emergency shutdown of the HANA indexserver forcing it to simply crash out.
This will require a DB recovery (automatic) upon restart!

Use the hdbcons utility (as <sid>adm) to first activate the crash functionality, then invoke the crash:

> hdbcons
> crash activate
> crash emergencyshutdown

Inside the indexserver trace file (from diagnosis view in HANA Studio), you will be able to see the following text towards the end of the file "ExternalCommandHandler.cpp (00958)  NOTE: INTENTIONAL CRASH: errorExitWithoutCrashdump".

Once the restart is complete, you can verify that recovery was required from the indexserver trace file:

"LoggerImpl.cpp(00933) : Replayed 130944B (0MB) of log in 17.4729 seconds; 0.00714693MB/s; max known TID=50175"
