I’m considering making a tool for tracking/auditing user logons, as this is something people often ask me to make, but there’s one thing I want to get your opinions on first.
Recently I’ve been asked by quite a few people if I have any tools that can report on which computers a user is logged on to and that would provide an audit trail of when they have logged on (not just the last logon time). I’ve responded to most of these requests by asking the question that I’m about to ask here, but I very rarely get a reply so I figured I would post it here and see if I get any more feedback.
Basically the question is about how the tool should collect the logon information (e.g. the date/time a user logged on and which computer they logged on to). As far as I can see there are 2 practical ways of doing this, and I will describe each below:
Method 1: DC Event Logs
If you enable logon auditing on your domain, every time a user authenticates with a DC then an event log entry will be written to that DC’s security event log. So one way of tracking user logons would be to have a central service running on a server that periodically checks the event logs of each DC and parses the relevant log entries to get the logon information from them and store it in its own database. However, there are several problems with this method in my opinion and that’s what led me to thinking of the second method. Some of the problems I can see are:
- This won’t catch any offline logons. Users logging on to a computer that is not connected to the network at the moment they log on will not authenticate with a DC, so there will be no log entry for them. You might think that’s not such a big deal as all of your users connect to the company network, but there are an awful lot of VPN solutions in use for remote users now that require the user to log on first before they connect to the VPN, so none of these logons would be captured.
- Event log entries are not designed to be parsed. From my tests and research so far I’ve not been able to find a single event log entry, logged on the DC when a user logs on to a PC, that contains both the user’s name and the computer’s name. There seem to be several kinds of entry logged during logon. Some come from the Kerberos system and give good information, but I’m not sure they would be logged in all cases as I don’t think Kerberos is always used for logons in every domain (it might also be hard to differentiate the entries for user logons to computers from those for other services that use Kerberos). Some are general authentication entries that only give you the IP address of the computer the user is logging on to. Others get logged every time the user does a network logon to access a file server, and so on. The format of these event log entries (and even their event IDs) also changes between different versions of Windows Server. It can’t be impossible to parse these log entries and get the required information, because I know some other tools do it, but it is certainly not easy and that means more chance of problems and reliability issues. EDIT: It turns out the tool I was thinking of might also struggle to get workstation logons purely from the DC event logs, as it only seems to be showing me logons to the DC itself and not to the workstations.
- Requires periodic polling. The central service for the tool would need to poll the event logs on each DC at a configured interval. This means you would either have to set it to a very short interval so that you get up to date information whenever you use the tool and hope there’s no noticeable performance impact from this, or you would have to set it to just poll once a day or something and accept that your reports won’t ever be completely up to date. Personally I don’t like either of these options, especially as in a large domain with lots of DCs this would either take a long time (especially for DCs separated from the central service by slow WAN links) or would require an agent to be installed on some DCs.
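As a rough illustration of the polling approach described in the last point, here is a minimal Python sketch of incremental polling: the central service remembers the last record number it processed on each DC, so each pass only reads newer entries. The `read_since` callable, the record shape and the DC names are all hypothetical stand-ins, and actually reading a remote Security log is not shown.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical record shape: (record_number, raw_message).
Record = Tuple[int, str]


def poll_once(
    dcs: Iterable[str],
    read_since: Callable[[str, int], Iterable[Record]],
    bookmarks: Dict[str, int],
) -> List[Tuple[str, int, str]]:
    """Collect entries newer than each DC's bookmark and advance the bookmark."""
    collected = []
    for dc in dcs:
        last_seen = bookmarks.get(dc, 0)
        for record_number, message in read_since(dc, last_seen):
            collected.append((dc, record_number, message))
            last_seen = max(last_seen, record_number)
        bookmarks[dc] = last_seen
    return collected


# Stand-in for the real event logs, purely for demonstration.
FAKE_LOGS: Dict[str, List[Record]] = {
    "DC1": [(1, "logon a"), (2, "logon b")],
    "DC2": [(1, "logon c")],
}


def fake_read_since(dc: str, last_seen: int) -> List[Record]:
    """Return only the fake records numbered above the bookmark."""
    return [record for record in FAKE_LOGS[dc] if record[0] > last_seen]


if __name__ == "__main__":
    marks: Dict[str, int] = {}
    print(poll_once(FAKE_LOGS, fake_read_since, marks))  # first pass: everything
    print(poll_once(FAKE_LOGS, fake_read_since, marks))  # second pass: nothing new
```

A bookmark like this only limits how much work each pass does. It does not remove the trade-off between a short polling interval and out of date reports.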
Method 2: Workstation Agent
So to avoid all of the problems mentioned above, I thought of having a service that runs on each computer that can log whenever a user account logs on to that PC and then send this information to a central server instantly (or the next time the user is connected to the company network). I’ve already written a basic proof of concept service that logs all of the required information successfully every time someone logs on. This also has the benefit of being able to capture logoffs and locks/unlocks of the PC, and it can capture local account logons, not just domain accounts.
Of course there are disadvantages to this method as well, even though it gets around each of the problems mentioned above with the DC event log method. The main problem with this workstation agent method is the actual installation of the agent on every machine – I’m conscious that some organisations might not like the idea of having to push out extra software on to each of their machines, which is why I’m looking for feedback on how much of an issue this would be for people. The agent installation would be quick and easy, and you would be able to do it remotely by just selecting an OU from AD to push it out to all of the computers in there or you could manually enter a list of computer names to have it pushed out to. There would be no manual configuration required on each machine or anything like that.
What do you think?
So if you were looking at a system like this that let you audit logons and run queries and reports to see who is logging on at what times and which computers etc, what would your preference be? Would you consider using a tool that required an agent to be pushed out to each workstation or would that be a deal breaker for you?
EDIT: To clarify – I’m not asking which you would prefer out of the 2 options, as I know everyone would prefer just one central service that collects info from DCs if that could be just as reliable and useful as the other option. Unfortunately that’s not really the choice. If no one would use a system that required an agent pushing out to each workstation then I probably just won’t make this tool at all, because the DC event log parsing method has too many issues that would make it painful for me to develop and would make the system not particularly useful or reliable.
There are other products like Quest AD Change Auditor that actually use an agent on the DC. If you put an agent on the DC that watches for changes, you wouldn’t have to watch event logs, turn up logging or poll periodically. I wouldn’t care too much about offline logins, since it would already have been recorded once that they had logged into that machine. We perform similar logging, but it is easily done with Splunk logging the AD logs. Quest AD Change Auditor is a much better high end solution for watching for changes.
Thanks for the quick response 🙂 I’m not sure if you’re confusing logon auditing and AD change auditing though – they’re 2 different tasks that require different solutions to audit them. Like you say, you can have an agent that watches for changes in AD and then you don’t have to poll event logs, but you can’t do the same with logons. AD has functionality built in to it for notifying a program when something changes – there is no such thing for logons apart from on each computer rather than on the DCs (hence the workstation agent idea). I’m only talking about logon auditing here, not AD auditing.
Chris, I’m not a big fan of agents on machines (as you already guessed many people aren’t). But my suggestion was going to be similar to Trevor’s – design it as an agent on the DC. I know nothing about what processes to intercept / watch for / etc, but I know this is the same approach Google takes with their Google Apps Password Sync (GAPS) tool. It intercepts some processes that happen on the DCs every time a password is changed, gathers the info, and then Google uses it to sync passwords from your domain into Google Apps. I know the logging around authentications is terrible, so watching the logs is almost useless. Netwrix has a Logon Reporter tool I’ve used in the past that does a decent job looking through the logs, but if there were a system that ran off agents on the DCs that would continually parse the information into a useful database (one where I could set triggers to email me if certain things happened), that would be a nice product to have.
Thanks for the feedback. The problem is that there is absolutely no way for an agent on the DC to be notified of logons. For password sync there are built in methods in AD for intercepting password changes as they happen, and other such changes have similar support through an API that lets you register your program to be notified after something happens. As far as I know there is no such system for workstation logons, because they are not really happening on the DC: the DC is just authenticating the username and password combo, and the only place it passes that information to is the event logs, which like you say are pretty terrible to look at or to try to parse. I’m pretty sure the Netwrix tool just looks at the event logs, which is why I said it must be possible to parse them with some success, but it still seems pretty tricky and potentially unreliable to me. I’ll keep looking and see if I can find any way to intercept AD authentication requests on the DC and inspect them, but I suspect there isn’t 😦
I found this article from SANS that gives pretty good info on logon processes, but you’ve probably figured at least this much out already. FWIW: http://www.sans.org/reading_room/whitepapers/forensics/windows-logon-forensics_34132
Cool, I’ll take a look, thanks. But yeah, I’ve spent the last few hours trying to find any possible way of getting a notification from (or any kind of involvement in) the authentication process on a domain controller, and it looks pretty much impossible. The only method I found that was even vaguely close was some rather complicated C++ code that someone had written that basically replaced the calls to the normal authentication packages (NTLM or Kerberos) with itself, and then called the real authentication packages’ DLLs after doing whatever kind of logging or action you wanted to do with the information about the user being authenticated. I don’t want to go modifying fundamental authentication behaviour on domain controllers with my own software though; that’s just asking for problems and would definitely not be supported by Microsoft.
Just had a read through that PDF and it does help clarify some of the event log entries, but I’m still unable to find an event log entry that provides the username, the workstation name and an indication that this is a normal interactive logon (e.g. at the Ctrl Alt Del screen) rather than an authentication logon of the kind that happens all the time for accessing network drives, databases etc. Annoyingly the Kerberos related events only give you the workstation’s IP address and not its name, apart from some where the workstation name is provided as the “service name”, but the problem is that there are lots of other instances where the service name is the name of the DC authenticating the user or is just the built in Kerberos user account. I really don’t understand why MS couldn’t just provide a simple audit entry that says user X logged on to computer Y.
I’ve done some testing with the Netwrix tool for tracking/auditing logons and it looks like it only actually gets logons that happened directly on the DCs by default. I’ve been testing logging on to several workstations and none of them show in the Netwrix reports; the only logons that show are the ones where I logged on to the DC itself. There’s an option to collect events from workstations as well (which is not enabled by default) but presumably that means it will directly query the event logs on each workstation. At least it’s not just me that finds it difficult to parse the DC event logs successfully to get workstation logons then… I’ll contact Netwrix and see what they say though, perhaps I’m missing an option somewhere.
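To make the distinction concrete, here is a hedged Python sketch of the kind of filter being looked for. It assumes the records have already been read into dictionaries, and that they are event ID 4624 (“An account was successfully logged on”, used by Windows Vista/Server 2008 and later) carrying the `LogonType`, `TargetUserName`, `WorkstationName` and `IpAddress` fields. The dictionary shape and the sample values are made up for illustration. It does not solve the problem described above: an interactive logon is typically recorded with logon type 2 on the workstation where it happens, while the DC mostly sees type 3 network logons and Kerberos ticket events.

```python
from typing import Any, Iterable, Iterator, Mapping, Tuple

LOGON_EVENT_ID = 4624  # successful logon (Windows Vista / Server 2008 and later)

# Logon types that mean a person signed in to a session, as opposed to
# type 3 (network), which is logged for file share access and similar.
INTERACTIVE_TYPES = {
    2: "interactive",
    10: "remote interactive",
    11: "cached interactive",
}


def interactive_logons(
    events: Iterable[Mapping[str, Any]],
) -> Iterator[Tuple[str, str, str]]:
    """Yield (user, computer, kind) for interactive logons only."""
    for event in events:
        if int(event.get("EventID", 0)) != LOGON_EVENT_ID:
            continue
        kind = INTERACTIVE_TYPES.get(int(event.get("LogonType", 0)))
        if kind is None:
            continue
        # Fall back to the IP address when no workstation name was recorded.
        computer = event.get("WorkstationName") or event.get("IpAddress", "")
        yield event.get("TargetUserName", ""), computer, kind


# Made-up sample records (hypothetical users, names and addresses).
SAMPLE = [
    {"EventID": 4624, "LogonType": 2, "TargetUserName": "alice",
     "WorkstationName": "PC-01", "IpAddress": "10.0.0.5"},
    {"EventID": 4624, "LogonType": 3, "TargetUserName": "alice",
     "WorkstationName": "", "IpAddress": "10.0.0.5"},
    {"EventID": 4768, "TargetUserName": "alice", "IpAddress": "10.0.0.5"},
    {"EventID": 4624, "LogonType": "10", "TargetUserName": "bob",
     "WorkstationName": "", "IpAddress": "10.0.0.9"},
]

if __name__ == "__main__":
    for user, computer, kind in interactive_logons(SAMPLE):
        print(user, computer, kind)
```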
OK I’ve found a method that MIGHT let me get my own DLL involved in the authentication process, but it is going to require a lot of marshalling between .NET and native C/C++ which is always a challenge and also there’s some debate over whether or not it will work for interactive logons or if it only works for network authentication logons that occur when a user accesses file shares etc. Unfortunately it is going to take an awful lot of work and testing to even find out if it will actually work for what I want to do, but I’ll give it a go.
UPDATE: I’ve managed to get the Kerberos authentication package to call my DLL, but even though all my DLL does is write the time to a text file and then tell the Kerberos package to continue, I seem to have locked myself out of my entire test domain now, haha. Logging on to the DC or any workstation just says “username or password incorrect”. This is why you don’t try this kind of thing in a production environment 🙂
FURTHER UPDATE: OK I don’t think this method of intercepting the authentication on each DC is going to work. I resolved the issue that was causing the logons to fail, so my DLL is getting called and returning successfully now but actually getting any information out of the parameters passed in to the DLL is proving very difficult. I’ve been at it for the last 4 hours and not managed to even get the username or workstation name out of it, and half the time it crashes the DC during boot up (that’s another bad thing about this method – it requires a reboot before Windows picks up on the fact that it should call the new DLL). As much as I hate admitting defeat, I think in this case it is just too awkward to make a .NET DLL work with a function that is only meant to be used by native unmanaged code (i.e. not .NET). So I guess it is back to the DC event log parsing or workstation agent methods – I know I could try to provide both options, but then I’m pretty sure everyone would only use the DC event log parsing option so if that’s the case it seems pointless spending all the time and effort on making a workstation agent (and server side component for communication)
– Modifying DCs is going to be a stretch for any network admin to do without resistance.
– Intercepting event logs would be ideal but sounds like it will be difficult.
– A client agent is not desirable, but we do do it from time to time. I deployed a similar solution a few years back, where it initiated a password sync prompt for our permanent VPN users so that they didn’t get locked out of their machines. Win7 fixed this. If it is a client solution I would suggest the following:
* group policy configuration of clients (and supply ADMX, not just ADM)
* MSI install so you can easily deploy via GPO
* Client should try and connect to network at startup, otherwise try every X seconds when an active network is detected.
* Client could have other functionality as well – one in mind is similar to this startup script I wrote: (http://ivan.dretvic.com/2012/10/automatically-generate-description-field-for-computers-in-active-directory/)
All in all, I have to wonder why this is so often requested. I can imagine it would be handy to have, but you are now talking about a client/server system that takes more thought to integrate, is limited to Windows machines only, and deals with audit trails, which in my opinion need to be verifiable: the integrity of the source data and the stored data has to be demonstrable, with no unofficial modification.
I would only go down two paths for this system. Firstly, I would build a client based service that has the power to do many things, this being the first. Secondly, I would build an app that interrogates existing AD DCs for the data as needed. You make great tools for SysAdmins to use. This is more of a system and I’m not sure it’s one worth building.
Just had an afterthought… Does ADFS give you access to this information any easier?
I know exactly what you’re saying – it does start to sound like it is more hassle than it is worth just to track logons. I agree the DC agent that intercepts authentication requests is a non-starter, and the DC event log parsing also seems to be dead as I spoke to Netwrix (the makers of the only other logon reporting tool I’m aware of) and they said that the only way their tool can get interactive logon audits from workstations is if you set the tool to query the workstation event logs directly as well as the DC event logs. I don’t want to make a tool that attempts to remotely query event logs on every workstation periodically as there’s going to be so many machines not reachable at that time and lots of performance issues on remote machines etc.
Funnily enough I had actually already made a list of features/requirements for the tool a couple of days ago and it looked very similar to your list 🙂 One thing I didn’t have down was the need for ADMX files instead of ADM files though – why do you want ADMX files specifically?
Like you I thought of perhaps making this into a general purpose Cjwdev agent that could be used for more than just logon tracking (as then it makes the client/server/database architecture more worthwhile). Again you read my mind, as I was already thinking of one option being for it to update the computer’s description in AD with various details about the computer or the user currently logging on. Another use could be for the software inventory system I’m going to be developing soon, as ideally that would need something on each PC rather than periodically attempting to query the registry remotely (again because machines would be missed if they weren’t online during the polling schedule and also because firewalls and disabling of the remote registry service might make remote collection impossible).
I don’t mind it just being limited to Windows machines, as all of my tools only target Windows.
I’ve not looked into ADFS much to be honest, but from my understanding of it I don’t think it would help in this scenario. I’ll do some research though.
A few years ago I wrote logon/logoff scripts that write to a database every time someone logs on and off. It records time stamp, user name, department, logon or logoff, host name, etc. This works pretty well for any PC that can reach the database but for laptops or older/slower PCs this doesn’t work at all.
I’ve been thinking about modifying the script to log the information to a file then write it to a database if/when it is available. This would allow the script to work on laptops and collect historic data when they eventually reconnect to the network.
I think a local service on the workstation would work for this since the main limiting factor is connectivity to the database. Perhaps scripts could write the info to a file and a service could be responsible for getting it to a server.
If you are interested in the scripts or the app that pulls the info from the database you are welcome to them.
Thanks David, and yeah that’s pretty much exactly how this agent would function on the workstations. It would sit running in the background all the time and when someone logs on it would try to send the relevant information to the central service, but if that failed then it would just log it to a local file and retry after 30 seconds. If that failed again then it would wait until either the machine is rebooted or the IP address changes and then it would try again. The problem with doing this from a logon script is that offline users won’t run the logon script, because the machine can’t find the script when it is stored on a network share such as the Netlogon share or the GPO’s sysvol share (although I’ve seen people say the files for logon scripts are meant to be cached for offline logons, I’ve never seen that work properly myself).
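The store and retry behaviour described above can be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the `StoreAndForward` class, the `send` callable and the spool file format (one JSON object per line) are all hypothetical, and the scheduling of retries (after 30 seconds, after a reboot or after an IP address change) is left out, so whatever trigger is used would simply call `flush()`.

```python
import json
import os
from typing import Callable, Dict, List

Event = Dict[str, str]


class StoreAndForward:
    """Send events to a central service, keeping unsent ones in a local file."""

    def __init__(self, spool_path: str, send: Callable[[Event], None]) -> None:
        self.spool_path = spool_path
        # Assumption: send() raises OSError when the server is unreachable.
        self.send = send

    def _load(self) -> List[Event]:
        if not os.path.exists(self.spool_path):
            return []
        with open(self.spool_path, encoding="utf-8") as handle:
            return [json.loads(line) for line in handle if line.strip()]

    def _save(self, events: List[Event]) -> None:
        with open(self.spool_path, "w", encoding="utf-8") as handle:
            for event in events:
                handle.write(json.dumps(event) + "\n")

    def flush(self) -> int:
        """Try to send spooled events oldest first; return how many remain."""
        pending = self._load()
        while pending:
            try:
                self.send(pending[0])
            except OSError:
                break  # still offline: keep the rest for the next attempt
            pending.pop(0)
        self._save(pending)
        return len(pending)

    def record(self, event: Event) -> int:
        """Spool the new event first so it survives a crash, then try to send."""
        self._save(self._load() + [event])
        return self.flush()
```

Writing to the file before sending means an event can be delivered twice if the agent stops between a successful send and the file update, so the central service would need to tolerate duplicates.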
Good point on the logon scripts; even when they do run they don’t always work right.
Either way I would gladly use ‘Method 2’ above since this would be the most reliable way to collect the information. Let me know if you need a beta tester.
I was just wondering if you were still working on this project, as it sounds exactly like the type of system I am looking for. Would I be able to run reports that give me a breakdown of user time logged on per day/week/month and the number of times logged in/out per day/week/month? This is what I need to report on. In my organisation the average age of employees is around 16-19, and they do like to take as many liberties as they can.
I’ve put this on the back burner for now as a lot of people are put off by the requirement of having to install an agent on each machine. I’ll hopefully revisit it in the future though, once I’ve completed a few other projects.
You have my full (moral :D) support for keeping on with this project. I believe having an agent is vital for this to work accurately, since we also need to capture logons and logoffs when the computer is not connected to the domain. It would basically be what David Holland has already mentioned above.
There is a product called UserLock, from IS Decisions, which does that with an agent installed on the machine. I actually found your blog post while looking for a replacement for UserLock…
UserLock keeps a database on a server and agents deployed on the workstations. Every time a user logs on or off (and also when the workstation is locked/unlocked), the agent tries to contact the central database in order to register the event. If it’s not accessible, it keeps the events on the local computer until it’s back online.
The agent also handles cases where the workstation has been shut down without a proper logoff – if this happens, then it considers the time of the logoff as being the same as the time of the shutdown event. This is another reason why an agent is needed – a logoff script would not run if the workstation is shut down without first logging off.
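The shutdown rule described above could be sketched like this in Python. It is not UserLock’s actual implementation, just a minimal illustration under stated assumptions: the events come from a single workstation, are already sorted by time, and use a hypothetical `(timestamp, user, kind)` tuple where `kind` is `"logon"`, `"logoff"` or `"shutdown"`. Each session is attributed to the day it started on.

```python
from collections import defaultdict
from datetime import date, datetime, timedelta
from typing import Dict, Iterable, Optional, Tuple

Event = Tuple[datetime, Optional[str], str]  # hypothetical shape


def session_totals(events: Iterable[Event]) -> Dict[Tuple[str, date], timedelta]:
    """Total logged-on time per (user, day), closing open sessions at shutdown."""
    open_sessions: Dict[str, datetime] = {}
    totals: Dict[Tuple[str, date], timedelta] = defaultdict(timedelta)
    for when, user, kind in events:
        if kind == "logon" and user is not None:
            open_sessions[user] = when
        elif kind == "logoff" and user in open_sessions:
            start = open_sessions.pop(user)
            totals[(user, start.date())] += when - start
        elif kind == "shutdown":
            # No logoff was seen, so treat the shutdown time as the logoff time.
            for open_user, start in open_sessions.items():
                totals[(open_user, start.date())] += when - start
            open_sessions.clear()
    return dict(totals)


if __name__ == "__main__":
    sample = [
        (datetime(2013, 1, 7, 9, 0), "alice", "logon"),
        (datetime(2013, 1, 7, 12, 0), "alice", "logoff"),
        (datetime(2013, 1, 7, 13, 0), "alice", "logon"),
        (datetime(2013, 1, 7, 17, 30), None, "shutdown"),
    ]
    for (user, day), total in session_totals(sample).items():
        print(user, day, total)
```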
Please let me know if you want to know more about how UserLock works.
Thanks for the feedback – I found UserLock as well whilst researching existing tools that do this kind of thing. Perhaps you could email me (email@example.com) with some details of why you are looking for a replacement and what you would expect my tool to do that UserLock doesn’t already do?