
[Translation] API Spying Techniques for Windows 9x, NT and 2000

Original text:
                API Spying Techniques for Windows 9x, NT and 2000
                                                ---------Yariv Kaplan
API spying utilities are among the most powerful tools for exploring the inner structure of applications and operating systems. Unfortunately, neither the SDK nor the DDK provides any documentation or examples demonstrating how to implement such a utility. This article will attempt to shed some light on this subject by presenting several techniques for hooking API calls issued by Windows applications.
There are several things you'll need to consider before sitting down to write your own API spying utility. First, you'll need to decide whether you want to spy upon a single application or set up a system-wide API interceptor. Each approach can be useful in different situations. For example, assume that you need to write an application that blocks the execution of specific processes according to rules set by an administrator. Obviously, you will need a way to monitor the execution of new processes and terminate the ones that are marked as restricted. One way to accomplish that would be to establish a system-wide API interceptor that monitors calls made to the CreateProcess function (actually, to both the ANSI and Unicode versions of this function). Whenever an application issues a call to one of these functions in order to create a new process, your interceptor will gain control and perform whatever processing is necessary.
Other types of applications might require a simpler API interceptor capable of monitoring just one application at a time. A good example is BoundsChecker from NuMega - a tool capable of analyzing API calls in order to detect memory leaks and other bugs lurking inside Windows applications.
Whether you decide to employ a system-wide solution or opt for a simpler one, you'll still have to choose one among several API hooking techniques. The following sections will explore the various ways available for implementing an API interceptor for Windows, focusing on the advantages and disadvantages of each technique.
Proxy DLL
This is by far the easiest technique for hooking API calls under Windows. As an example of this technique, consider an anti-virus application that scans incoming email messages for viruses. An obvious requirement for such an application is the ability to hook Winsock's I/O functions in order to analyze data transfers between email clients and remote mail servers.
This can be easily accomplished by creating a proxy DLL that contains a stub for each of the functions exported from the Winsock library. If the proxy DLL is named exactly like the original Winsock library (i.e. wsock32.dll) and placed in the same directory where the target email application resides, then the interception occurs automatically. When the target application attempts to load the original Winsock library, the proxy DLL is loaded instead. All the calls made to Winsock's functions are routed to the exported stubs in the proxy. After performing the necessary processing, the proxy DLL simply routes the calls to Winsock and returns control back to its caller.
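The forwarding pattern described above can be modeled in a few lines. This is only a toy sketch in Python, not a real DLL: real_winsock is a hypothetical stand-in for the original library's export table, and a real proxy would be a native DLL whose exports match those of wsock32.dll one for one.

```python
# Toy model of a proxy DLL: every export of the "real" library needs a stub
# in the proxy, even when only one function is actually of interest.

# Hypothetical stand-in for the original library's export table.
real_winsock = {
    "send": lambda sock, data: len(data),
    "recv": lambda sock, size: b"\x00" * size,
    "closesocket": lambda sock: 0,
}

log = []  # records (function name, arguments) for every intercepted call


def make_stub(name, original):
    """Build a stub that logs the call, then forwards it to the original."""
    def stub(*args):
        log.append((name, args))
        return original(*args)  # route the call to the real library
    return stub


# The proxy must export a stub for *every* function of the original library.
proxy_winsock = {name: make_stub(name, fn) for name, fn in real_winsock.items()}

# The "application" believes it is talking to the real library.
sent = proxy_winsock["send"](42, b"HELO example\r\n")
```

Note that the dictionary comprehension hides exactly the work that is tedious in a native proxy, where each stub has to be written with the correct prototype and calling convention.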
If this method of API interception sounds too simple, there is a catch. Although simple to implement, this technique has one major drawback - hooking a single function located in a DLL that exports 200 functions would require creating a stub for each of these functions in your proxy DLL. This can be rather tedious, and at times impossible when some of the functions are not fully documented.
If you wish to see a working example of this technique, consult Matt Pietrek's WinInet utility, which was published in the Dec 94 issue of MSJ.
Patch those calls
When thinking about ways for intercepting an API call, there are two locations where you can intervene - either at the source of the call (the application code) or at the destination (the target function). This technique relies on the first option.
For each API function that you wish to intercept, you patch all the locations in the target application where calls to this function are issued. The modification can be done either on disk (to the executable file itself) or in memory (after the executable is already loaded). The tough part is pinpointing the exact locations where patching is necessary. In order to accomplish that, you'll need to implement a disassembler capable of analyzing assembly instructions. Obviously, writing a disassembler is far from being a trivial task, making this API interception technique one of the least popular among the group.
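To see why a disassembler is needed, consider what happens without one. The following Python sketch searches raw code bytes for the pattern of an indirect call through a given import slot (opcode FF 15 followed by a 32-bit address). The addresses and the code fragment are made up for illustration, and this is a naive byte search rather than a technique the article recommends.

```python
import struct


def find_indirect_calls(code, base_va, iat_slot_va):
    """Return addresses where the bytes look like CALL DWORD PTR [iat_slot_va].

    This is a raw byte search for the opcode FF 15 followed by a 32-bit
    address. Because it does not decode instructions, it can also match
    bytes that merely sit inside other instructions or in data.
    """
    pattern = b"\xFF\x15" + struct.pack("<I", iat_slot_va)
    hits = []
    pos = code.find(pattern)
    while pos != -1:
        hits.append(base_va + pos)
        pos = code.find(pattern, pos + 1)
    return hits


IAT_SLOT = 0x00403000  # hypothetical address of the import slot for the API
code = (
    b"\x6A\x00"                  # push 0
    b"\xFF\x15\x00\x30\x40\x00"  # call dword ptr [00403000h]  (genuine)
    b"\x68\xFF\x15\x00\x30"      # push 300015FFh
    b"\x40"                      # inc eax
    b"\x00\xC0"                  # add al, al
)
hits = find_indirect_calls(code, 0x00401000, IAT_SLOT)
```

The search reports two locations, but only the first is a real call. The second match straddles the operand of the push and the two instructions that follow it, so patching it would corrupt the code. Telling the two apart requires decoding instruction boundaries.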
IAT Patching
If you have ever looked into API hooking before, then you have probably heard about Import Address Table patching. The numerous advantages of this technique make it the most elegant and common way of hooking API functions under Windows.
The foundation of this technique relies on the fact that 32-bit Windows executable files and DLLs are built upon the Portable Executable (PE) file format. Files based on this specification are composed of several logical chunks known as sections. Each section contains a specific type of content. For example, the .text section holds the compiled code of the application while the .rsrc section serves as a repository for resources such as dialog boxes, bitmaps and toolbars.
Among all of the sections present in a Windows executable file, the .idata section is particularly useful for those who wish to implement an API interceptor. A special table located in this section (known as the Import Address Table) holds image-relative offsets (RVAs) to the names of imported functions referenced by the executable's code. Whenever Windows loads an executable into memory, it patches these entries with the actual addresses of the imported functions.
Why does Windows go to the trouble of patching the IAT in the first place? Well, as you probably know, Windows executable files and DLLs are often relocated (due to collisions) when they are loaded into memory. This makes it impossible to set in advance the target addresses of calls made to imported functions in the executable code. In order to ensure that these calls successfully reach their destination, it would have been necessary for Windows to locate and patch every single call made to an imported function after an executable image is loaded into memory. Obviously, such a large amount of processing during the initialization of new processes and DLLs would have slowed down the system, giving users the impression that Windows is a slow and unresponsive operating system ;-).
Fortunately, the designers of Windows were quite resourceful when they addressed this issue. In the current implementation of Windows executables and DLLs, calls made to imported functions are routed through the IAT using an indirect JMP instruction. The fact that imported function calls are "drained" through one location saves Windows the trouble of traversing the executable image in memory, looking for call instructions that are destined for patching.
What has all of this got to do with API spying? Well, it turns out that the Windows IAT redirection mechanism offers a perfect way of intercepting API calls. By overwriting a specific IAT entry with the address of a logging routine, an API interceptor can gain control before the original function gets a chance to be executed by the processor.
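The idea can be illustrated with a toy model in which the "IAT" is simply a table of callables and the application only ever calls imported functions through that table. This is a Python sketch of the principle, not real PE manipulation; create_file and the table layout are hypothetical stand-ins.

```python
# Toy model of IAT patching. The "IAT" is a table of callables and the
# application always calls imported functions indirectly through it.

calls = []  # log of intercepted calls


def create_file(name):
    """Hypothetical stand-in for the real imported API."""
    return "handle:" + name


# The loader has filled in the slot with the address of the real function.
iat = {"CreateFileA": create_file}


def application():
    # Application code never calls create_file directly, only via the IAT.
    return iat["CreateFileA"]("C:\\boot.ini")


def make_logger(original):
    def logger(*args):
        calls.append(args)      # gain control before the original runs
        return original(*args)  # then forward to the real function
    return logger


def hook_iat_entry(table, name, wrap):
    """Overwrite one IAT slot; return the original so it can be restored."""
    original = table[name]
    table[name] = wrap(original)
    return original


before = application()
saved = hook_iat_entry(iat, "CreateFileA", make_logger)
after = application()
iat["CreateFileA"] = saved  # unhook by restoring the saved entry
```

The application code is never modified. Only the single table entry changes, which is why one write is enough to catch every call the module makes to that import.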
Obviously, there are other issues involved in the implementation of this technique, such as the requirement for the logging code to be executed in the memory context of the intercepted application. These issues are discussed in the following resources:

Matt Pietrek's excellent book "Windows 95 System Programming Secrets" contains the source code of an API interception utility called APISpy32. This utility was initially published as part of an article written by Matt for the Dec 1995 issue of MSJ. A newer version of APISpy32 is available and can be ordered through Matt's web site at http://www.wheaty.net.
Spying on COM Objects by Dmitri Leman, WDJ, July 1999.
Patch the API
This is my favorite technique for hooking API functions. It has the inherent advantage of being able to trace API calls issued from different parts of an application while requiring a modification only to a single location - the API function itself.
There are several approaches that can be used here. One option is to replace the first byte of the target API with a breakpoint instruction (Int 3). Any call issued to that function would generate an exception, which would be reported to your API interceptor provided that it acts as a debugger of the target process. Unfortunately, there are several problems with this approach. First, the poor performance of the Windows exception handling mechanism would considerably slow down the system. A second problem is related to the implementation of the Windows debugging API. As soon as a debugger shuts down, Windows terminates all the applications that were under its control. Obviously, such behavior is completely unacceptable if you're implementing a system-wide interceptor, which must be able to terminate itself while its target applications keep running.
Another possibility is to patch the target function with one of the control-transfer instructions of the CPU (i.e. either a CALL or a JMP). Once again, there are several problems with this approach. First, it is possible that the patching would overrun the end of the intercepted function. This can occur in cases where the target API is shorter than 5 bytes (a CALL or JMP with a 32-bit relative operand is 5 bytes long). Another issue is the need to constantly switch between the patched and "unpatched" versions of the intercepted function. This means that once your logging routine receives control from the CPU, it must restore the intercepted function to its previous unhooked state. This is required to allow the API interceptor to route the call to the original function without generating an infinite loop of calls back to the logging routine. Note that during the time the CPU is executing the original function, other calls to that function might be issued from different parts of the system. Since the function is in the unhooked state at that stage, the API interceptor will miss those calls. A more sophisticated API interceptor might utilize a better technique in order to overcome this limitation. Take a look at the following figure to get a better idea of how this could be accomplished.
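The patching half of the breakpoint approach is easy to picture. The Python sketch below models process memory as a bytearray and shows the one-byte patch and its removal. It does not model the debugging API or exception delivery, and the function body is made up.

```python
INT3 = 0xCC  # one-byte x86 breakpoint opcode (INT 3)


class BreakpointHook:
    """Model of a one-byte breakpoint patch on a function's first byte."""

    def __init__(self, memory, address):
        self.memory = memory    # bytearray standing in for process memory
        self.address = address  # offset of the target function's entry
        self.saved = None       # original first byte, once installed

    def install(self):
        self.saved = self.memory[self.address]
        self.memory[self.address] = INT3

    def remove(self):
        # The original byte must be put back before the function can run
        # normally again (for example, to step over the breakpoint).
        self.memory[self.address] = self.saved
        self.saved = None


# Hypothetical function body: push ebp / mov ebp, esp / pop ebp / ret
memory = bytearray(b"\x55\x8B\xEC\x5D\xC3")
hook = BreakpointHook(memory, 0)
hook.install()
patched = bytes(memory)
hook.remove()
restored = bytes(memory)
```

Because the patch is a single byte, it can never overrun a short function, which is the one advantage this approach has over the JMP-based patch discussed next.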

Figure 1 - The interception process
In this case, the API interceptor places a JMP instruction at the beginning of the target function, but not before saving the first 5 bytes of the function to a pre-allocated buffer in memory (a stub). The exact number of bytes to be copied to the stub may change depending on the instructions present at the head of the function. In cases where 5 bytes do not end on an instruction boundary, it is necessary to copy additional bytes until there is enough space for the JMP instruction to be inserted. Note that relative control-transfer instructions (i.e. JMPs and CALLs) need to be modified during copying to ensure that they transfer control to the right location in memory when executed from the stub. Obviously, performing such an analysis of assembly instructions requires the assistance of a disassembler, which, as I have mentioned before, is not very easy to implement.
If you are not intimidated by the complexity of this technique and wish to use it with your own applications, you might want to have a look at the source code of Detours - an API interception library, which was developed by one of the members of Microsoft's research department.
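The arithmetic behind the stub can be sketched as follows. This Python model only computes the bytes involved and does not touch real process memory. All addresses are hypothetical, and the prologue is assumed to end on an instruction boundary already, which is exactly the part that needs a disassembler in practice.

```python
import struct

JMP_LEN = 5          # opcode byte + 32-bit relative displacement
MASK32 = 0xFFFFFFFF  # emulate 32-bit address wrap-around


def make_jmp(src, dst):
    """Encode JMP rel32 placed at address src that lands on address dst."""
    # The displacement is measured from the end of the 5-byte instruction.
    return b"\xE9" + struct.pack("<I", (dst - (src + JMP_LEN)) & MASK32)


def rel32_target(instr, addr):
    """Decode where a 5-byte CALL/JMP rel32 located at addr transfers to."""
    (disp,) = struct.unpack("<i", instr[1:JMP_LEN])
    return (addr + JMP_LEN + disp) & MASK32


def relocate_rel32(instr, old_addr, new_addr):
    """Re-encode a 5-byte CALL/JMP rel32 so it keeps its original target."""
    if instr[0] not in (0xE8, 0xE9) or len(instr) != JMP_LEN:
        raise ValueError("not a 5-byte CALL/JMP rel32")
    target = rel32_target(instr, old_addr)
    disp = (target - (new_addr + JMP_LEN)) & MASK32
    return instr[:1] + struct.pack("<I", disp)


def build_stub(prologue, func_addr, stub_addr):
    """Stub = copied prologue bytes + JMP back to the rest of the function."""
    resume = func_addr + len(prologue)
    return prologue + make_jmp(stub_addr + len(prologue), resume)


# Hypothetical addresses and a prologue that happens to be exactly 5 bytes:
#   push ebp / mov ebp, esp / push ecx / push ebx
FUNC, STUB, LOGGER = 0x77E01000, 0x00A00000, 0x10002000
prologue = b"\x55\x8B\xEC\x51\x53"

stub = build_stub(prologue, FUNC, STUB)
patch = make_jmp(FUNC, LOGGER)  # bytes that overwrite the function's head
```

With this layout the logging routine calls the stub instead of the original entry point, so the function never has to be restored to its unhooked state and no calls are missed.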
Breaking address space barriers
By now, you should have a pretty good idea of how to implement an API interceptor capable of redirecting API calls to your own logging code. However, one problem remains unsolved - how do you ensure that the logging code is executed in the right address space? Before we can answer that question, we need to better understand the internal architecture of Windows memory management.
As you probably know, each 32-bit Windows application gets a unique address space to toy with. During a task switch, Windows updates its page tables to reflect the new process's linear to physical memory mapping (also known as the process memory context). As a result, the page table entries that correspond to the private memory area of the process are modified to point to different physical addresses while those that correspond to shared memory regions remain untouched.
Under Windows 9x, the 4GB linear address space is divided into several distinct memory regions, each with its own predefined purpose. MS-DOS and the first portion of the 16-bit global heap occupy the lowest 4MB. The next region spans from 4MB to 2GB and it is where Windows 9x loads each process's code, data and DLLs. Since each process's physical addresses are unique, Windows 9x ensures that when a specific process is active, the page table entries corresponding to the 4MB-2GB region are mapped to this process's physical memory pages. The idea is for all processes to share the same linear addresses but not the same physical addresses (which is obviously impossible). Think of the 4MB-2GB linear memory region as a window into the physical address space. This window slides each time a process is scheduled for execution to provide a view of the unique physical memory locations occupied by that process.
So what's the real benefit of this mechanism anyway? Well, since all processes use the same linear addresses, Windows can load them into different physical locations without performing any code fix-ups. This means that the memory representation of a process can be (almost) an exact copy of its on-disk image.
Continuing our exploration of the Windows 9x address space reveals that the region spanning from 2GB to 3GB is reserved for the upper portion of the 16-bit global heap. Also in this region are the memory-mapped files and Windows system DLLs (such as USER32, GDI32 and KERNEL32), which are shared among all running processes. This region is extremely useful for API interceptors, since it is visible in all the active address spaces. In fact, APISpy32 loads its spying DLL (APISpy9x.dll) into this area, thus ensuring that its logging code is accessible from any process issuing a call to an intercepted function.
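The layout described so far can be written down as a small lookup table. The Python sketch below uses the region bounds given in the text. The last entry (3GB-4GB) is not covered above and is included only so that the map spans the full 4GB; treat its description as an assumption.

```python
MB = 1 << 20
GB = 1 << 30

# (start, end, description); bounds follow the description in the text.
WIN9X_REGIONS = [
    (0,      4 * MB, "MS-DOS and low 16-bit global heap"),
    (4 * MB, 2 * GB, "per-process private code, data and DLLs"),
    (2 * GB, 3 * GB, "shared: upper 16-bit heap, mapped files, system DLLs"),
    (3 * GB, 4 * GB, "system area (assumption, not described in the text)"),
]


def describe_win9x_address(addr):
    """Return the description of the region a linear address falls into."""
    for start, end, what in WIN9X_REGIONS:
        if start <= addr < end:
            return what
    raise ValueError("not a 32-bit linear address: %#x" % addr)


def visible_in_every_process(addr):
    """True if the address lies in the 2GB-3GB shared region."""
    return 2 * GB <= addr < 3 * GB
```

For example, visible_in_every_process(0xBFF70000) is true, while an address such as 0x00400000 falls in the private region and means something different in every process. A spying DLL loaded into the shared region is therefore reachable from any process that calls an intercepted function.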
Under Windows NT/2000, the story is a bit different. There is no documented way of loading a DLL into an area shared by all processes, thus the only way to ensure that the logging code is accessible by the target process is to inject the spying DLL into its address space. One of the ways to accomplish this is by adding the name of the DLL to the following registry value:
HKEY_LOCAL_MACHINE\Software\Microsoft\Windows NT\CurrentVersion\Windows\AppInit_Dlls
This value causes Windows to load your DLL into every process that links with user32.dll. Unfortunately, this means that console applications, which do not usually link with this DLL, are not included. Other methods for injecting a DLL into a process's address space are thoroughly described in the article "Load Your 32-bit DLL into Another Process's Address Space Using INJLIB" by Jeffrey Richter, which was published in the May 1994 issue of MSJ. Additional
Injecting at the right moment
Knowing how to inject a piece of code into another process's address space is one thing, but timing is also a crucial factor when implementing an API interceptor. Inject at the wrong moment and your interceptor might miss calls issued by the target application. This problem really comes to light when implementing a system-wide interceptor. In that case, the interceptor needs to inject its spying DLL into a process's address space immediately after that process is created, but before it issues calls to intercepted functions. The best way to accomplish this is by monitoring calls to the CreateProcess function. When such a call is detected, the interceptor's logging routine passes control to the original CreateProcess function with a modified dwCreationFlags parameter, which contains the CREATE_SUSPENDED value. This ensures that the target process is started, but placed in a suspended state. The interceptor can then inject its spying DLL into the target process's address space and activate it using the ResumeThread API function.
Other ways for detecting the execution of processes under Windows are presented in the following section. Unfortunately, due to their asynchronous nature, they are less suited for detecting the appropriate time for injecting a spying DLL into another process's address space.
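The sequence of steps can be modeled as below. This is a Python sketch of the control flow only: real_create, inject and resume are hypothetical callables standing in for the original CreateProcess, the DLL injection step and ResumeThread. The check on the caller's own flags is an addition not spelled out in the text. Without it, the interceptor would resume a process that its creator wanted to keep suspended.

```python
CREATE_SUSPENDED = 0x00000004  # value of the Win32 creation flag


def hooked_create_process(real_create, inject, resume, cmdline, flags):
    """Model of the logging routine installed over CreateProcess."""
    caller_wants_suspended = bool(flags & CREATE_SUSPENDED)

    # Always start the new process suspended so nothing runs before the hook.
    process = real_create(cmdline, flags | CREATE_SUSPENDED)
    inject(process)

    # Only resume if the caller did not ask for a suspended process itself.
    if not caller_wants_suspended:
        resume(process)
    return process


events = []  # trace of what the fake system functions were asked to do


def fake_create(cmdline, flags):
    events.append(("create", cmdline, flags))
    return {"cmdline": cmdline, "suspended": bool(flags & CREATE_SUSPENDED)}


def fake_inject(process):
    # Injection must happen while the target is still suspended.
    events.append(("inject", process["suspended"]))


def fake_resume(process):
    process["suspended"] = False
    events.append(("resume",))


proc = hooked_create_process(fake_create, fake_inject, fake_resume,
                             "notepad.exe", 0)
```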
Detecting Process Execution
As you saw earlier, it is often necessary for an application to be able to detect the execution of new processes. One possibility, which was previously mentioned, is to hook the CreateProcess function and monitor calls made to that function from different parts of the system. However, implementing a system-wide API interceptor for the sole purpose of hooking a single function often does not justify the effort.
Fortunately, there is a simpler way to accomplish that under Windows 95 (OSR 2 and above), 98 and NT/2000, without requiring the support of a full-fledged system-wide API interceptor. Under Windows 9x, it is possible for a virtual device driver to respond to the CREATE_PROCESS message sent by VWIN32 whenever a new process is executed. Windows NT/2000 offers similar functionality through the use of the undocumented PsSetCreateProcessNotifyRoutine function exported from NTOSKRNL. This function allows a device driver to register a callback function that receives notifications from the operating system whenever a new process is started. If you are well versed in device driver development, you should have no trouble deciphering these interfaces yourself using the following examples as your guide:

The Nerditorium column by Jim Finnegan, which was published in the January 1999 issue of MSJ, presented a utility that detects execution of processes under Windows NT/2000.
The ProcSpy32 utility available on my web site detects execution of processes under Windows 9x using a combination of a device driver and an OCX component.
Winsock Hooking
Many of the programmers that look into API hooking techniques are seeking a way for monitoring network activity performed by Winsock applications. Such functionality is often required by anti-virus utilities, personal firewalls and Internet content blocking applications (e.g. CyberPatrol).
If you require such functionality in your application, you need not write a system-wide API interceptor; you can instead use the layered service provider mechanism, which was introduced along with Winsock 2. This mechanism is thoroughly described in the article "Unraveling the Mysteries of Writing a Winsock 2 Layered Service Provider", which appeared in the May 99 issue of MSJ.
An alternative technique for monitoring network activity under Windows relies on the way NDIS drivers are layered on top of each other. By writing an intermediate driver (or alternatively by hooking NDIS interfaces), it is possible to monitor not only TCP/IP communication, but also any other data transferred through the network adapter. Take a look at the following resources for more information about these techniques:

VToolsD available from NuMega technologies comes bundled with a sample driver (HookTDI), which demonstrates a way for monitoring network activity under Windows 9x.
IMSAMP is a sample NDIS intermediate driver, available on Microsoft's web site.
DDE and BHO (a.k.a. Browser Helper Object)
In cases where Winsock does not provide sufficient information about the underlying network activity, a programmer can use two additional services, which are tailored specifically for the purpose of monitoring Internet browsers such as Netscape Navigator and Internet Explorer. Through these interfaces it is possible not only to monitor the data transferred through the network, but also to track the browser window, which initiated the network transaction.
GDI Hooking
I often get asked about ways for monitoring graphics operations under Windows. Up until now, there was no documented way to monitor these calls unless you were willing to create a system-wide API interceptor that hooks every GDI function in existence. Obviously, this is far from being an easy solution. Fortunately, the new accessibility API available under Windows 9x introduces a mechanism that allows applications to monitor graphics operations before they reach the video driver. Windows 2000 offers similar functionality through a slightly different interface.
Interrupt Hooking
In the old days of DOS, interrupt hooking was widely used by TSR (Terminate and Stay Resident) utilities and other applications to extend the operating system functionality and monitor its behavior. Under Windows, interrupts still play a major role, and are mainly used as a portal that connects user-mode (a.k.a. ring 3) code to the operating system's kernel. If you wish to hook interrupts under Windows, you should have a look at the following resources:
Undocumented Windows NT by Prasad Dabak, Sandeep Phadke and Milind Borate.
Monitoring NT Debug Services by Jose Flores, Windows Developer Journal, February 2000.
NTSpy - A utility that monitors NT's system calls by hooking interrupt 2Eh.
COM Hooking
API spying applications are great for monitoring Windows APIs, but their lack of support for COM interfaces makes them useless when trying to monitor OCXs and other OLE components. Fortunately, there is a way to monitor COM interfaces under Windows. This technique was presented in the article "Spying on COM Objects" by Dmitri Leman, which was published in the July 1999 issue of WDJ.
16-bit API Interception
Interception of 16-bit code is not common anymore, but some programmers still require such capabilities in their applications. If you are unfortunate enough to be still working on 16-bit code, you might want to have a look at the article "Hook and Monitor Any 16-bit Windows Function With Our ProcHook DLL" by James Finnegan, which was published in the January 1994 issue of MSJ.
NT System Calls
If you have ever examined ntdll.dll with QuickView, you might have noticed that it exports a set of functions that begin with the Nt prefix. These functions are actually small stubs of code that pass control to the Windows NT kernel (NTOSKRNL) using interrupt 2E. Many of the functions exported from kernel32.dll are nothing more than control transfer routines to the stubs located in ntdll. For example, when a Windows application issues a call to CreateFile located in kernel32.dll, the call is redirected to NtCreateFile, which passes it on to NT's kernel for further processing. The special design of this mechanism allows a device driver to hook these interfaces, thus providing a way for monitoring activities performed by Windows NT/2000 applications. A thorough description of this mechanism is presented in the following resources:
Undocumented Windows NT by Prasad Dabak, Sandeep Phadke and Milind Borate.
Windows NT/2000 Native API Reference by Gary Nebbett.
Regmon - A utility that monitors access to the registry by using system call hooking techniques.
NTSpy - A utility that monitors NT's system calls by hooking interrupt 2Eh.
Tracing NT Kernel-Mode Calls by Dmitri Leman, WDJ, April 2000.
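To make the idea of a "small stub" concrete, the Python sketch below decodes the system service number from the bytes of such a stub. The byte layout is an assumption about what x86 stubs on NT 4.0 and Windows 2000 typically look like, it is not taken from the article, and the example bytes (including the service number) are made up.

```python
import struct

# Byte layout assumed for an x86 ntdll stub on NT 4.0 / 2000:
#   B8 imm32        mov eax, <service number>
#   8D 54 24 04     lea edx, [esp+4]      ; pointer to the arguments
#   CD 2E           int 2Eh               ; enter the kernel
#   C2 imm16        ret <argument bytes>


def parse_nt_stub(stub):
    """Return (service_number, argument_bytes), or None if layout differs."""
    if len(stub) < 14:
        return None
    if stub[0] != 0xB8 or stub[5:9] != b"\x8D\x54\x24\x04":
        return None
    if stub[9:11] != b"\xCD\x2E" or stub[11] != 0xC2:
        return None
    (service,) = struct.unpack("<I", stub[1:5])
    (arg_bytes,) = struct.unpack("<H", stub[12:14])
    return service, arg_bytes


# Hypothetical stub: service number 0x20, 11 DWORD arguments (0x2C bytes).
example = b"\xB8\x20\x00\x00\x00\x8D\x54\x24\x04\xCD\x2E\xC2\x2C\x00"
parsed = parse_nt_stub(example)
```

The service number is the index the kernel uses to select the routine to run, which is why a driver that replaces entries in the kernel's dispatch table sees every call made through these stubs. Service numbers differ between Windows versions, so a tool should read them from the installed ntdll.dll rather than hard-code them.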
Resources
Undocumented Windows NT by Prasad Dabak, Sandeep Phadke and Milind Borate.
Windows NT/2000 Native API Reference by Gary Nebbett.
Advanced Windows (3rd Ed) by Jeffrey Richter.
Inside the Windows 95 File System by Stan Mitchell.
Windows NT File System Internals by Rajeev Nagar.
Windows NT Device Driver Development by Peter Viscarola and Anthony Mason.
Developing Windows NT Device Drivers by Edward N. Dekker and Joseph M. Newcomer.
Writing Windows WDM Device Drivers by Chris Cant.
Programming the Microsoft Windows Driver Model by Walter Oney and Forrest Foltz.
Windows Undocumented File Formats by Pete Davis and Mike Wallace.
Windows 95 System Programming Secrets by Matt Pietrek.

