Fetching Activity Explorer data via the Export-ActivityExplorerData cmdlet

The Activity Explorer tool has been available in the Microsoft Purview portal for a while now, and among other things it covers Retention and Sensitivity label events. The corresponding events can of course be found as part of the Unified audit log, but the Activity Explorer tool provides curated and enriched information, without all the “noise” events. As such, the tool definitely has some merits, but in a busy tenant the UI approach is usually not the best choice. An Export functionality is provided within the portal, and now an equivalent PowerShell cmdlet helps you get all the data exported automagically and without worrying about the UI limits.

Microsoft released the Export-ActivityExplorerData cmdlet in Preview back in August, as part of Roadmap item #93281. Now, just in time for Ignite, the cmdlet has matured to GA status and can be run after connecting to the Security and Compliance Center endpoint. As pointed out in the roadmap item description, the cmdlet is not subject to the 10,000-row limit that the UI Export functionality suffers from. That, and being able to get the data in an automated manner, should be all the reasons you need to start using it. So let’s see whether this method is an adequate replacement for the UI.

We start by generating a handful of events, such as applying a sensitivity label to a Word document and an email, across the web apps and the Office client. We also apply protection (via a separate RMS template) to the file, downgrade the label, etc. While the corresponding events should start appearing shortly under the Activity Explorer UI, I’ve given it a day to make sure all the actions are properly captured (spoiler alert, they aren’t). Here’s a sample of what the UI shows (note that only a handful of columns fit on the screen):

To see how the Export-ActivityExplorerData cmdlet compares, we first open a session to the SCC endpoint via Connect-IPPSSession, and then collect the data by providing a simple date filter. The cmdlet itself does support additional filters, albeit in a somewhat convoluted way, as detailed in the official documentation. We also specify JSON as the output format and store it all in a variable for reuse:

$res = Export-ActivityExplorerData -StartTime "18 Oct 2022" -EndTime "19 Oct 2022" -OutputFormat JSON
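For scheduled runs, the date window has to be computed rather than typed in. Here is a minimal sketch, assuming the cmdlet keeps accepting the "dd MMM yyyy" string shape used above. Format-ExplorerDate is a hypothetical helper name, not part of any Microsoft module.

```powershell
# Hypothetical helper: renders a date in the same "18 Oct 2022" shape used above.
# The invariant culture keeps month names independent of local regional settings.
function Format-ExplorerDate {
    param([Parameter(Mandatory)] [datetime] $Date)
    $Date.ToString('dd MMM yyyy', [cultureinfo]::InvariantCulture)
}

# A rolling window: yesterday through today
$start = Format-ExplorerDate -Date (Get-Date).AddDays(-1)
$end   = Format-ExplorerDate -Date (Get-Date)
```

The resulting strings can then be passed as -StartTime $start -EndTime $end.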

If you take a look at the output (as stored in the $res variable), you will find the bulk of the data crammed into a single property, namely ResultData. Some other properties give you a clue as to the number of results retrieved (RecordCount), the total number of results available (TotalResultCount) and whether this is the last “page” of results (LastPage). A single page is limited to a maximum of 5000 results, with 100 returned by default. To retrieve multiple pages, use the -PageCookie parameter and pass the value you previously retrieved via the WaterMark property. As mentioned above, you can also use a set of filter parameters (-Filter1 through -Filter5) to narrow down the results, in addition to the required datetime filter (-StartTime and -EndTime). Lastly, you can specify the output format as either CSV or JSON via the -OutputFormat parameter.
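The paging logic described above can be sketched as a loop that keeps feeding WaterMark back in via -PageCookie until LastPage is reported. Treat this as a sketch under assumptions: Get-AllActivityExplorerData and its -FetchPage parameter are hypothetical names of my own, the -PageSize parameter name is not shown in the examples here and should be checked against the official documentation, and I am assuming LastPage comes back as either a boolean or the string "True".

```powershell
# Hypothetical helper: collects every page of results into one list.
function Get-AllActivityExplorerData {
    param(
        [Parameter(Mandatory)] [string] $StartTime,
        [Parameter(Mandatory)] [string] $EndTime,
        [int] $PageSize = 5000,   # assumed parameter name; 5000 is the per-page maximum
        # Fetches a single page; can be swapped for a stub when testing offline
        [scriptblock] $FetchPage = {
            param($Params)
            Export-ActivityExplorerData @Params
        }
    )

    $query = @{
        StartTime    = $StartTime
        EndTime      = $EndTime
        OutputFormat = 'JSON'
        PageSize     = $PageSize
    }
    $all = [System.Collections.Generic.List[object]]::new()

    do {
        $page = & $FetchPage $query
        if ($page.ResultData) {
            # Assign first, then wrap in @() so a single record and an array behave the same
            $records = $page.ResultData | ConvertFrom-Json
            foreach ($record in @($records)) { $all.Add($record) }
        }
        # Handles LastPage arriving as a boolean or as the string "True"
        $isLast = "$($page.LastPage)" -eq 'True'
        # Feed the watermark back in to request the next page
        $query['PageCookie'] = $page.WaterMark
    } while (-not $isLast -and $page.WaterMark)

    $all
}
```

With a live session, calling Get-AllActivityExplorerData -StartTime "18 Oct 2022" -EndTime "19 Oct 2022" should return the individual records already converted from JSON. Note that the loop passes -PageCookie only from the second request onwards.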

To review the records retrieved, we can take the ResultData blob and convert it from JSON to a custom PS object. This in turn allows us to sort, group, filter and so on. Do note that individual events will correspond to different activities (e.g. File read or Label applied), each with different metadata/properties, which makes working with the output trickier. We can however still do things right in PowerShell:
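Because different activities carry different properties, cmdlets that take their column list from the first object they see (Export-Csv, Format-Table) can silently drop properties that only appear on later records. One way around this, sketched below with made-up sample data, is to project every record onto the union of all property names first. ConvertTo-UniformRecord is a hypothetical helper, and the FilePath property in the sample is illustrative only.

```powershell
# Hypothetical helper: gives every record the same set of properties.
function ConvertTo-UniformRecord {
    param([Parameter(Mandatory)] [object[]] $Records)

    # Collect every property name seen across all records, in first-seen order
    $names = [System.Collections.Generic.List[string]]::new()
    foreach ($record in $Records) {
        foreach ($name in $record.PSObject.Properties.Name) {
            if (-not $names.Contains($name)) { $names.Add($name) }
        }
    }

    # Re-project each record; properties a record lacks come back as $null
    $Records | Select-Object -Property $names.ToArray()
}

# Illustrative sample, standing in for $res.ResultData
$sample  = '[{"Activity":"FileRead","FilePath":"a.docx"},{"Activity":"LabelApplied","SensitivityLabel":"1111"}]'
$records = $sample | ConvertFrom-Json
$uniform = ConvertTo-UniformRecord -Records @($records)
```

The $uniform collection can then be piped to Export-Csv -NoTypeInformation. Nested values such as EmailInfo would still need flattening before they become useful in a CSV.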

#sort the output based on datetime
(($res.ResultData) | ConvertFrom-Json) | sort Happened

#group by operation type
($res.ResultData | ConvertFrom-Json) | group Activity -NoElement

Count Name
----- ----
    2 File read
    3 Label applied
    1 Label changed

#preview the results via Out-GridView
, (($res.ResultData) | ConvertFrom-Json) | ogv

#same as above
($res.ResultData) | ConvertFrom-Json | select -ExpandProperty SyncRoot | ogv

So, a total of six events were found during the time period in question, matching the number of events we can see in the Activity Explorer UI. The set of properties is largely the same, with some of them changing names (such as “Workload” instead of “Location”), or being grouped together (such as EmailInfo representing a grouping of Sender, Receivers and Subject values). The biggest (and most annoying) difference is the value of the SensitivityLabel property – while the UI shows the “human readable” identifier, the data returned from the cmdlet gives only the GUID, forcing you to run additional queries.
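The additional lookup for label names can be done once and reused. Below is a minimal sketch that adds a readable name next to each GUID, given a GUID-to-name map. Add-LabelName and the SensitivityLabelName property are hypothetical names, the GUID in the sample is made up, and the commented-out Get-Label lines assume that cmdlet exposes Guid and DisplayName properties.

```powershell
# Hypothetical helper: adds a SensitivityLabelName property to each record.
function Add-LabelName {
    param(
        [Parameter(Mandatory)] [object[]] $Records,
        # Maps label GUID (as string) to a readable name
        [Parameter(Mandatory)] [hashtable] $LabelMap
    )

    foreach ($record in $Records) {
        $guid = [string]$record.SensitivityLabel
        # Fall back to the raw GUID when the label is not in the map
        $name = if ($guid -and $LabelMap.ContainsKey($guid)) { $LabelMap[$guid] } else { $guid }
        $record | Add-Member -NotePropertyName SensitivityLabelName -NotePropertyValue $name -PassThru -Force
    }
}

# With a live session, the map could be built along these lines (assumption, see above):
# $labelMap = @{}
# Get-Label | ForEach-Object { $labelMap[[string]$_.Guid] = $_.DisplayName }

# Illustrative data only
$labelMap = @{ '11111111-2222-3333-4444-555555555555' = 'Confidential' }
$events   = '[{"Activity":"LabelApplied","SensitivityLabel":"11111111-2222-3333-4444-555555555555"},{"Activity":"FileRead"}]' | ConvertFrom-Json
$named    = Add-LabelName -Records @($events) -LabelMap $labelMap
```

Add-Member modifies the records in place, so the original objects gain the extra property as well.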

So, the cmdlet output is more or less on par with the information exposed within the Activity Explorer tool, which makes it suitable for automation scenarios. Since the data is curated to cover only specific scenarios though, you still might need to consider using the Unified audit log/Management activities API to fetch any relevant events instead, especially if you are interested in more context around them. For example, the single LabelApplied event for the Exchange workload we have in the output above does not reveal the IP address of the actor, but the corresponding Create event for the email message will happily show it. In fact, a SensitivityLabelApplied event is also visible for the same email within the Unified audit log, whereas the Activity Explorer failed to surface it. No events are captured in the Activity Explorer for a scenario where we are emailing a protected document either, and while the Unified audit log does not have a standalone event for this scenario, you can correlate a Create event with a FileAccessedExtended event coming from the Exchange Online client application within the same time frame to take an educated guess as to what happened. Lastly, neither method captured the event of protecting a document with an RMS template, so gaps remain within Microsoft’s coverage even now.

In any case, providing a way to obtain any data without having to click around the UI is always a plus in my book, and in the case of the Activity Explorer, Microsoft has done a satisfactory job in providing the Export-ActivityExplorerData cmdlet. Before closing the article, here are some examples of how to use the set of -Filter parameters.

#Filter by Activity type
$res = Export-ActivityExplorerData -StartTime "18 Oct 2022" -EndTime "19 Oct 2022" -OutputFormat JSON -Filter1 @('Activity','LabelApplied','FileRead','LabelChanged')

#Filter by User
$res = Export-ActivityExplorerData -StartTime "18 Oct 2022" -EndTime "19 Oct 2022" -OutputFormat JSON -Filter1 @('User','vasil@michev.info')

#Filter by Activity type AND user
$res = Export-ActivityExplorerData -StartTime "18 Oct 2022" -EndTime "19 Oct 2022" -OutputFormat JSON -Filter1 @('Activity','LabelApplied','FileRead','LabelChanged') -Filter2 @('User','vasil@michev.info')

#Add a filter for the Workload too
$res = Export-ActivityExplorerData -StartTime "18 Oct 2022" -EndTime "19 Oct 2022" -OutputFormat JSON -Filter1 @('Activity','LabelApplied','FileRead','LabelChanged') -Filter2 @('User','vasil@michev.info') -Filter3 @('Workload','Exchange')
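Since each filter is just an array with the filter name first and the values after it, the numbered parameters can also be generated and splatted. This is a sketch only. New-ExplorerFilterParams is a hypothetical helper, and it assumes the five-filter cap implied by -Filter1 through -Filter5.

```powershell
# Hypothetical helper: turns an ordered map of filter name -> values into
# Filter1 ... Filter5 entries suitable for splatting.
function New-ExplorerFilterParams {
    param([Parameter(Mandatory)] [System.Collections.IDictionary] $Filters)

    if ($Filters.Count -gt 5) { throw 'Only Filter1 through Filter5 are available.' }

    $params = @{}
    $i = 1
    foreach ($name in $Filters.Keys) {
        # Each filter is an array: the filter name first, then one or more values
        $params["Filter$i"] = @($name) + @($Filters[$name])
        $i++
    }
    $params
}

$filterParams = New-ExplorerFilterParams -Filters ([ordered]@{
    Activity = 'LabelApplied', 'FileRead', 'LabelChanged'
    User     = 'vasil@michev.info'
    Workload = 'Exchange'
})

# With a live session:
# $res = Export-ActivityExplorerData -StartTime "18 Oct 2022" -EndTime "19 Oct 2022" -OutputFormat JSON @filterParams
```

The ordered dictionary keeps the filters in the order they were declared, so the result mirrors the last example above.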

10 thoughts on “Fetching Activity Explorer data via the Export-ActivityExplorerData cmdlet”

  1. Ross Screaton says:

    Hi Vasil,

    Any idea where I can find a list of valid filters (e.g. Activity) and filter values (e.g. Activity types such as “File Copied to Cloud”)? The cmdlet does not appear to align with the UI.

    Thanks!

  2. joaquin says:

    Does anyone know if you can EXCLUDE using the filter vs INCLUDE? I can’t seem to find any schema documentation related to all the possible activity events.

  3. Teddy says:

    Hi guys,

    I’m unable to export or show more than 10,000 rows with -PageCookie.
    In fact, it showed only 100 rows even when I entered -PageCookie with the WaterMark value.
    Is there anything that I’m missing?

    Cheers,
    Teddy

  4. veena says:

    Getting below Error

    Export-ActivityExplorerData:
    Line |
    19 | Export-ActivityExplorerData -StartTime “07/08/2022 07:15 AM” -End …
    | ~~~~~~~~~~~~~~~~~~~~~~~~~~~
    | The term ‘Export-ActivityExplorerData’ is not recognized as a name of a cmdlet, function, script file, or executable program. Check
    | the spelling of the name, or if a path was included, verify that the path is correct and try again.

  5. Mar says:

    Hi Vasil,

    Great explanation!

    I’m having trouble defining the Start and End Time -> Error: “String was not recognized as a valid DateTime.”

    I have tried to define it in multiple formats but I always have the same error, any idea what could be going on?

    Thanks!

    1. Vasil Michev says:

      I stick to the format above, as in my experience it almost never causes trouble.
