Search This Blog

Saturday, October 25, 2008

A significant part of sql server process memory has been paged out.

First of all :
There is no solution to this issue in Windows 2003 (You might have lost the confidence after reading this :) ...read it fully).


This is the sample error that we get :
2007-01-23 16:30:10.14 spid1s A significant part of sql server process memory has been paged out. This may result in a performance degradation. Duration: 0 seconds. Working set (KB): 1086400, committed (KB): 2160928, memory utilization: 50%.

Lets try to understand this error first :

Duration : the time or interval or gap after which this error has been reported. Here 0 means that this is the first time on that day , that this error occured.You will see a fixed pattern of duration .

Working set :The memory pages kept in the RAM for the use of process/threads.Remember that this is set approximately by the OS and its the OS that can trim it or add to it if it see that too many pages ar ein RAM for a process and it needs to give minimum bare pages to some other process which it cannot .In our case its roughly 1 GB . Check wikipedia for a better definition (http://en.wikipedia.org/wiki/Working_set)

Committed : The actual memory committed by SQL Server .NO one should be able to take this memory from SQL Server or deallocate it .Not even the OS .But it can be paged :) by the OS and thats what it does.

Memory Utilization : the actual memory utilization by SQL Server after its working set is paged out .In this case its 50 which means the remaining 50 has been paged out .Had it been 2% then the remaining 98% is paged out .This error will start apperaing once the working set of SQL Server trims equal to or more than 50%

Now , one can easily deduce that 2GB RAM was committed [see commited in the error] but only 1 GB was the working set (see the error again) .So 50% is the current working set as compared to its committed value .Remaining 50% in in the disk and hence SQL Server will have to go for Hard page fault in case the page its looking for is on the disk .

Many a times you will see that on a cluster it will simply failover .This is because of the same effect .


Why it happens ??

Every process needs A contiguous chunk of memory but memory is largely fragmented ...
Some EXEs needs large chunks of contiguous memory .So even if there is enough memory available , its fragmented .And thats the cause of this issue .

When troubleshooting these type of issues, the key is to find out why the physical memory on the server gets exhausted and creates a situation that forces sql server to scale its memory usage back. As long as the available memory is below the limits for which Windows keeps sending low memory notification, sql server has no option but to scale itself back to avoid poor performance.



More on the dynamic memory management capabilities of sql server is discussed in the topic http://msdn2.microsoft.com/en-us/library/ms178145.aspx

There are a few situations where SQL Server Process working set might be paged out by Windows despite these memory resource notification mechanism.


-> If windows is not sending the correct notifications to all listening processes at the right moment and thresholds.

-> If SQL Server is not responding fast enough to the low memory resource notification from Windows.

-> Conditions in Windows where working sets of all processes are trimmed.

-> Windows might decide to trim a certain percentage of working set of various or specific processes.

-> Incorrect usage of memory management functions by Device Drivers.

SQL Server memory management is designed to dynamically adjust the committed memory based on the amount of available memory on the system. SQL Server does this to avoid being paged out by the OS memory manager with the help of some APIs.

SQL Server registers with the memory resource notifications of Windows. If a low memory notification comes from Windows, sql server scales its memory usage down. When windows sends the high memory notification, sql server can grow its memory usage target. The Win32 API's being used for these are documented in http://msdn.microsoft.com/library/default.asp?url=/library/en-us/memory/base/creatememoryresourcenotification.asp

Solutions
This issue can be fake or real .Do not panic if that workign ser is low , say in KBs or you get these errors when SQL Server starts up .

But if its the other way out then you can go through options i have mentioned below .If eventually nothing works , then you need to dig out who on the server needs so much of memory that makes OS to trim SQL Server's working set .

KNOWN ISSUES[Please go through each of it and apply as relevant]:



Microsoft has found some known issues that may cause the SQL Server 2005 process working set to be trimmed.



Issue1:



For more information, click the following article number to view the article in the Microsoft Knowledge Base:

905865 The sizes of the working sets of all the processes in a console session may be trimmed when you use Terminal Services to log on to or log off from a computer that is running Windows Server 2003



Issue2:



Computers that are running Windows Server 2003 can be too aggressive when they cache dirty buffers if there is an application performing buffered I/O, such as a file copy operation. This behavior can cause the working set in SQL Server to be trimmed. For more information, click the following article number to view the article in the Microsoft Knowledge Base:



920739 You may experience a decrease in overall system performance when you are copying files that are larger than approximately 500 MB in Windows Server 2003 Service Pack 1 or Service Pack 2



Issue3:



On a computer that is running SQL Server, applications may use the system cache too much. Therefore, the operating system will trim the working set of SQL Server or of other applications. If you notice that the application uses the system cache too much, you can use some memory management functions in the application. These functions control the system cache space that file IO operations can use in the application. For example, you can use the SetSystemFileCacheSize function and the GetSystemFileCacheSize function to control the system cache space that file IO operations can use.



Issue4:



Additionally, third-party device drivers that use the MmAllocateContiguousMemory function and specify the value of the HighestAcceptableAddress parameter to less than 4 gigabytes (GB) may also cause the SQL Server 2005 64-bit working set to be trimmed.



Currently, the iLO Management Channel Interface Driver (Cpqcidrv.sys) from Hewlett-Packard is known to cause this issue on x64 editions of SQL Server 2005.

The integrated Lights-Out Management (iLO) Interface Driver update for Windows Server 2003 is available at the following Hewlett-Packard Web site:

http://h20000.www2.hp.com/bizsupport/TechSupport/Document.jsp?lang=en&cc=us&objectID=c00688313&jumpid=reg_R1002_USEN

The problem is fixed in this driver update.



Issue5:



Broadcom driver bxvbda.sys caused memory trimming via MmAllocateContiguousMemory()

This driver is in Windows 2003 SP2, and OEM'ed from DELL and HP, but originally from Broadcom.



Disabling the advanced functionality from the driver may fully resolve the issue. In order to do that you have to do 2 things:

- On Windows 2003 Sp1, apply the hotfix download from http://support.microsoft.com/default.aspx?scid=kb;EN-US;936594 and perform the below mentioned registry changes. On Windows 2003 Sp2, just perform the below mentioned registry changes.

- Disable TCP Chimney, TCPA, and RSS using the registry entries

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters]

"DisableTaskOffload"=dword:00000001

"EnableTCPA"=dword:00000000

"EnableRSS"=dword:00000000

"EnableTCPChimney"=dword:00000000



TCP Chimney is a stateful offload. TCP Chimney offload enables TCP/IP processing to be offloaded to network adapters that can handle the TCP/IP processing in hardware. Each TCP Chimney offload-capable network adapter has a finite number of connections that it can support in hardware. TCP connections will be offloaded to hardware as long as the hardware can support these connections. After the hardware offload connection limit has been reached, all additional connections are handled by the host stack.

Side effects: the NIC will actually function better



Article on the affect of the setting: http://support.microsoft.com/default.aspx?scid=kb;EN-US;912222

Article on how this setting affects SQL operations: http://support.microsoft.com/default.aspx?scid=kb;EN-US;918483



Issue 6:

938486 A Windows Server 2003-based computer becomes unresponsive because of a memory manager trimming operation that is caused by an indeterminate module that requests lots of memory

http://support.microsoft.com/default.aspx?scid=kb;EN-US;938486

This kernel hotfix limits the number of trim operation per process to 8,192 pages. Previously, the memory manager would try to trim one-quarter of the working set size of a process. This can help to a certain extent on some of the processes being paged out.





ACTION PLAN that you can follow :

1. Cap SQL Server Max server memory to 75% of RAM and min to say , 1 GB .
2. Increasing the LowMemoryThreshold value.
Set the LowMemoryThreshold value (in MB), so the OS will notify applications such as SQL on low memory conditions much earlier.

In Regedit -> go to

HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\SessionManager\MemoryManagement

Right click on the right pane,

Select New -> select click DWORD Value -> enter LowMemoryThreshold

Double Click LowMemoryThreshold -> value (choose decimal) -> 650

Reboot required to take effect.

Default values per MSDN:

“The default level of available memory that signals a low-memory-resource notification event is approximately 32 MB per 4 GB, to a maximum of 64 MB. The default level that signals a high-memory-resource notification event is three times the default low-memory value.”



3. Disable TCP Chimney, TCPA, and RSS using the registry entries [mentioned above in known issues # 5]

4.Reboot the server for the registry values to take effect .

5. If needed be involve the OS(platforms) team to find out if there is any memory leak (chance of this is very less, but still) .



References :
http://support.microsoft.com/kb/918483

GOOD NEWS :
Lock Pages in memory option is available in SQL Server STD edition from SQL Server 2005 SP3 CU4 and SQL Server 2008 SP1 CU2 onwards .

CAUTION:
LPIM should be the last resort that you should opt for .In worst case if you have to opt for it ,make sure you cap SQL Server Max server memory to 75% of RAm or might be less .


Happy Learning !!!

3 comments:

kbuck said...

When you start a VM in Virtual Server 2005, and sometimes just at random on a host with at least one VM running, all processes on the host will have their working set trimmed.

I finally found a solution to this, it was causing me a major headache for some time.

The solution is to run Virtual Server and Virtual Machine Helper under a different user account rather than NETWORK SERVICE / LOCAL SYSTEM.

I tried it with a limited user account and set permissions in various folders without success (VS would start and stop immediately). Eventually I just tried with a local admin account and it worked.

Now when I start a virtual machine...NO WORKING SET TRIM!!!

Thanks Microsoft. Another great reason to continue to be married to your wonderful products.

Abhay said...

You said that all the processes were getting their Working set trimmed ...How did you find that all the processes were getting their working set trimmed ?? Can you also check wha tall accounts are included in "Lock pages in Memory " policy (Start >> Run >> Secpol.msc >> User rights assignments)

Abhay said...

Further the reason you dont have the workign set trimming any more is because of the local system account that you are using ...by default this account will be a part of LPIM and hence SQL will not sacrifice its working set nor OS will ask it to do ...so indirectly you are doing the same ...adding the SQL Server account to LPIM policy
HTH
Abhay