Defense

Whilst conducting security testing and assurance activities, I went looking for logon events in Office 365. My first query was against IdentityLogonEvents. This revealed a multi-month attack by one or more threat actors against a tenant, and led me down a rabbit hole of logs and systems. This blog summarises some of the methods and findings to consider when threat hunting and reviewing authentication defences for Office 365. (Bear with me, I am tired, so this might need a bit of a tune-up later!)

Here is where it all started:

// Failed logon events per hour, by account and failure reason
IdentityLogonEvents
| where TimeGenerated > ago(90d)
| where ActionType contains "failed"
| summarize count() by bin(TimeGenerated, 1h), AccountName, FailureReason
| sort by count_
| render columnchart

[Chart: failed logon events per hour, by account and failure reason]

We enriched the IP data with IPinfo:

[Screenshots: IPinfo enrichment results]
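Once the IPs are enriched, a rough per-country share of the attempts is easy to work out. The sketch below is Python rather than KQL, and it assumes the enrichment step has already produced an IP-to-country mapping (IPinfo returns a two-letter "country" field for each IP). The function name, the mapping and the sample IPs are all hypothetical and are not data from this investigation.

```python
from collections import Counter


def country_share(attempt_ips, ip_to_country):
    """Return {country: percentage of attempts}, rounded to one decimal place."""
    # IPs missing from the enrichment mapping are grouped under "unknown".
    counts = Counter(ip_to_country.get(ip, "unknown") for ip in attempt_ips)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {country: round(100 * n / total, 1) for country, n in counts.most_common()}


# Hypothetical sample data using documentation-range IPs (not real findings).
attempts = ["203.0.113.10", "203.0.113.10", "198.51.100.7", "192.0.2.44"]
lookup = {"203.0.113.10": "CN", "198.51.100.7": "IN"}
shares = country_share(attempts, lookup)
print(shares)
```

Treat the output as a hint only: source countries for proxy or botnet traffic say little about where the operator actually is.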

We also looked at the dataset in GreyNoise:

[Screenshot: GreyNoise results]

As you can imagine, I wanted to make sure the services were safe and that no unauthorized access had occurred.

// Sign-ins rejected with result code 50053 (account locked), per hour and user
SigninLogs
| where TimeGenerated > ago(90d)
| where ResultType == "50053"
| project TimeGenerated, UserId, UserPrincipalName, UserType, ResultDescription, ResultType, Location, AppDisplayName, SourceSystem, IPAddress
| summarize count() by bin(TimeGenerated, 1h), UserId
| sort by count_
| render columnchart

[Chart: 50053 sign-in results per hour, by user]

Advanced Hunting/Sentinel Data Sources

  • IdentityLogonEvents
  • SigninLogs

You will also see activity in the Unified Audit Log (UAL):

https://security.microsoft.com/cloudapps/activity-log

Controls

Security – Microsoft Azure

Authentication methods – Microsoft Azure

Example Queries

SigninLogs

This is the primary location to analyse.

// 50053 results grouped by description, result code and location
SigninLogs
| where TimeGenerated > ago(90d)
| where ResultType == "50053"
| project TimeGenerated, UserId, UserPrincipalName, UserType, ResultDescription, ResultType, Location, AppDisplayName, SourceSystem, IPAddress
| summarize count() by ResultDescription, ResultType, Location
| sort by count_ desc


A key point here is that ResultType 50053 and its description will appear slightly differently depending on which log you review.

SigninLogs will show Smart Lockout blocking the accounts: “Account is locked because user tried to sign in too many times with an incorrect ID or password.”

The wording is not particularly clear, but this is SMART LOCKOUT working. This WILL NOT cause a denial of service to legitimate users or existing sessions.
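To see why a lockout aimed at an attacker need not lock out the real user, here is a deliberately simplified Python model. It assumes the behaviour Microsoft documents for smart lockout: separate failure counters for familiar and unfamiliar locations, a default threshold of 10 failed attempts and an initial lockout of 60 seconds. The class, its method and the returned codes ("0" for success, "50126" for a bad password, "50053" for locked) are a toy illustration only. The real service is more sophisticated, for example it lengthens lockouts on repeat offences.

```python
from dataclasses import dataclass, field

LOCKOUT_THRESHOLD = 10  # assumption: documented default for Azure Public tenants
LOCKOUT_SECONDS = 60    # assumption: documented default initial lockout duration


@dataclass
class LockoutModel:
    """Toy model: one failure counter per (account, location class)."""
    failures: dict = field(default_factory=dict)
    locked_until: dict = field(default_factory=dict)

    def attempt(self, account, familiar, password_ok, now):
        key = (account, "familiar" if familiar else "unfamiliar")
        if now < self.locked_until.get(key, 0):
            return "50053"  # locked: rejected even if the password is valid
        if password_ok:
            self.failures[key] = 0
            return "0"      # success
        self.failures[key] = self.failures.get(key, 0) + 1
        if self.failures[key] >= LOCKOUT_THRESHOLD:
            # Threshold reached: lock this counter only, then reset it.
            self.locked_until[key] = now + LOCKOUT_SECONDS
            self.failures[key] = 0
        return "50126"      # invalid username or password


model = LockoutModel()
# Attacker: 12 wrong guesses from unfamiliar locations, one per second.
attacker = [model.attempt("alice", False, False, now=t) for t in range(12)]
# The real user signs in from a familiar location while the lockout is active.
user = model.attempt("alice", True, True, now=12)
# The attacker now guesses the right password, but is still locked out.
late_guess = model.attempt("alice", False, True, now=13)
print(attacker[-1], user, late_guess)
```

In this model the attacker's traffic only ever touches the "unfamiliar" counter, which is why the logs can show thousands of 50053 results while the account owner keeps working.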

// The same 50053 summary without the source IP column
SigninLogs
| where TimeGenerated > ago(90d)
| where ResultType == "50053"
| project TimeGenerated, UserId, UserPrincipalName, UserType, ResultDescription, ResultType, Location, AppDisplayName, SourceSystem
| summarize count() by ResultDescription, ResultType, Location

Identity Logon Events

// Failed OAuth2 token logons per day, by account
IdentityLogonEvents
| where TimeGenerated > ago(90d)
| where ActionType == "LogonFailed"
| where LogonType == "OAuth2:Token"
| summarize count() by bin(TimeGenerated, 1d), AccountName
| render columnchart

// Non-failed OAuth2 token logons per day, by account
IdentityLogonEvents
| where TimeGenerated > ago(90d)
| where ActionType != "LogonFailed"
| where LogonType == "OAuth2:Token"
| summarize count() by bin(TimeGenerated, 1d), AccountName
| render columnchart

// Non-failed OAuth2 token logons per day, by location
IdentityLogonEvents
| where TimeGenerated > ago(90d)
| where ActionType != "LogonFailed"
| where LogonType == "OAuth2:Token"
| summarize count() by bin(TimeGenerated, 1d), Location
| render columnchart

// Event volume by action type, application and logon type (no time filter in the query)
IdentityLogonEvents
| summarize count() by ActionType, Application, LogonType
| sort by count_ desc

// The same breakdown for the last 90 days, split by account
IdentityLogonEvents
| where TimeGenerated > ago(90d)
| summarize count() by ActionType, Application, LogonType, AccountName, AccountUpn
| sort by count_ desc

// Everything in the last two hours that is not a successful logon
IdentityLogonEvents
| where TimeGenerated > ago(2h)
| where ActionType != "LogonSuccess"
//| project TimeGenerated, ActionType, Application, LogonType, AccountName, AccountDomain, Location, ISP, IPAddress, FailureReason
//| project TimeGenerated, ActionType, Application, LogonType, Location, ISP, FailureReason
| sort by TimeGenerated desc
//| summarize count() by bin(TimeGenerated, 1d), Location
//| render columnchart

Testing Tools

0xZDH/o365spray: Username enumeration and password spraying tool aimed at Microsoft O365. (github.com)

0xZDH/Omnispray: Modular Enumeration and Password Spraying Framework (github.com)

References

Smart Lockout & Logging

Heads up: you need an Azure AD Premium P1 license for this feature:

Configure custom Azure Active Directory password protection lists – Microsoft Entra | Microsoft Learn

Ok, I’m not going to dissect every packet, but when we authenticate to Office 365 from the internet this is a probable pattern:

  • Our DNS client requests resolution.
  • An anycast IP is returned.
  • Our client attempts to connect, and the nearest regional datacentre point responds (global and fast).
  • We then attempt to authenticate.
  • This is where the SMART LOCKOUT process comes into effect.

Threat actors often use botnets, rotating proxy services and/or open redirects. A threat actor with the motivation, means and access can therefore easily send large volumes of requests from thousands or hundreds of thousands of IPs. This is where smart lockout is going to help us, but it will obviously look “very interesting” from a logging point of view (if you go looking).
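One way to make that distributed pattern visible is to compare the number of failed attempts per account with the number of distinct source IPs behind them. The Python sketch below assumes the sign-in rows have been exported as dictionaries carrying the SigninLogs columns UserPrincipalName and IPAddress. The function name and the sample rows are hypothetical.

```python
from collections import defaultdict


def spray_profile(rows):
    """Map each account to (failed attempts, distinct source IPs)."""
    attempts = defaultdict(int)
    sources = defaultdict(set)
    for row in rows:
        account = row["UserPrincipalName"]
        attempts[account] += 1
        sources[account].add(row["IPAddress"])
    return {account: (attempts[account], len(sources[account])) for account in attempts}


# Hypothetical exported rows (documentation-range IPs, example.com accounts).
rows = [
    {"UserPrincipalName": "alice@example.com", "IPAddress": "203.0.113.1"},
    {"UserPrincipalName": "alice@example.com", "IPAddress": "203.0.113.2"},
    {"UserPrincipalName": "alice@example.com", "IPAddress": "198.51.100.9"},
    {"UserPrincipalName": "bob@example.com", "IPAddress": "192.0.2.5"},
    {"UserPrincipalName": "bob@example.com", "IPAddress": "192.0.2.5"},
]
profile = spray_profile(rows)
print(profile)
```

An account where the distinct IP count is close to the attempt count looks like rotating infrastructure, whereas many attempts from one IP looks more like a misconfigured client or a single noisy host.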

It’s useful to know how this logging works, and it’s useful threat intelligence. Just because a threat actor’s attack hasn’t worked doesn’t mean you can ignore it:

  1. It may do in the future.
  2. You may be being targeted.
  3. You may want to assure controls and validate configuration for targeted (and other) identities in your environment.

Summary

Late at night, when I first saw 50k logon attempts, I realised this clearly needed investigating. The interesting thing was that there were no alerts or incidents. So I had a few questions:

  • Where is this coming from?
  • Why are they not alerting?
  • Are the accounts being locked out and is this causing Denial of Service (DoS)?
  • Can we block these?
  • Can we prove the identities only have legitimate access?

After running through a range of activity, learning, and testing I can now answer these questions. I’ve put this rapid publish post together to help other people.

  • Where is this coming from? One or more unknown threat actors, possibly in the eastern part of the globe (based on the percentage of source traffic from China and India, plus time correlation). This is not confirmed.
  • Why are they not alerting? The volume of events would create an unmanageable number of alerts, and the authentication attempts were not successful.
  • Are the accounts being locked out and is this causing Denial of Service (DoS)? Partially. Smart lockout is stopping the attackers from signing in (even if they had valid credentials, which we do not believe they do), but it is not denying service to legitimate users.
  • Can we block these? They are already being blocked.
  • Can we prove the identities only have legitimate access? Yes, we have checked and believe these are all “not compromised”.
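For the last question, one possible check (not necessarily the one used here) is to look for successful sign-ins that came from an IP which also produced lockout results. The Python sketch below assumes exported SigninLogs rows as dictionaries, with ResultType "0" meaning success and "50053" meaning locked. The function name and the sample rows are hypothetical.

```python
def suspicious_successes(rows):
    """Return successful sign-ins whose source IP also produced a 50053 result."""
    attack_ips = {r["IPAddress"] for r in rows if r["ResultType"] == "50053"}
    return [r for r in rows if r["ResultType"] == "0" and r["IPAddress"] in attack_ips]


# Hypothetical exported rows (documentation-range IPs, example.com accounts).
rows = [
    {"UserPrincipalName": "alice@example.com", "IPAddress": "203.0.113.1", "ResultType": "50053"},
    {"UserPrincipalName": "alice@example.com", "IPAddress": "203.0.113.1", "ResultType": "0"},
    {"UserPrincipalName": "alice@example.com", "IPAddress": "192.0.2.20", "ResultType": "0"},
    {"UserPrincipalName": "bob@example.com", "IPAddress": "198.51.100.9", "ResultType": "50053"},
]
hits = suspicious_successes(rows)
print(hits)
```

An empty result is reassuring but not proof on its own, since an attacker could sign in from an IP that never triggered a lockout, so it is worth pairing with a review of MFA and session activity for the targeted identities.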

So, great stuff. The only thing I would like, however, is for the log descriptions to be a bit clearer. It wasn’t immediately clear that SMART LOCKOUT was causing the locks, and then the question of DoS came into my mind. It took me quite some effort to get round to the “oh ok, that’s not causing DoS”.

We have layered controls here, but the fact remains that someone has been sending a large volume of authentication attempts to a small number of identities for several months. Without looking at these logs I would never have known that. Some might argue ignorance is bliss, but I’ve not really found that to be the case with cyber security.

Also CHATGPT – https://twitter.com/SU1PHR/status/1612820176528936962?s=20&t=fsqfLpruWTvOBsS4hRoUzQ

Maybe I need to hang up my blogging hat….