Reporting on Microsoft 365 mailbox item count and size by year via the Graph API

Happy new year! For the first article of 2024, we will cover the “give me a breakdown of items within my mailbox, by age” question that pops up semi-regularly on the forums. The usual answer I give to such queries is to use an EWS-based script, such as the one by Glen here. While this method will do, you might also consider using a Graph-based variant, as after all Microsoft has been trying to get us off EWS for years now. Unfortunately, I haven’t been able to find any such Graph-based script. This script by Nuno is the closest I’ve seen to such a solution, but it only addresses calendar scenarios. So, as an exercise, let’s build a Graph-based PowerShell script that will give us a breakdown of the item count and size, per folder and per year, for a given mailbox (or a set of mailboxes) in Exchange Online.

As usual with all my Graph code samples, the first step is to handle authentication. For the purposes of the current script, we will need to query the user object, then enumerate all the mailbox folders and get some statistics on the items held within. In other words, we need the User.Read.All permission, as well as the Mail.ReadBasic.All one (Mail.Read is needed if you also want to report on size, see below). I’d strongly recommend using the application permissions model on this one, otherwise you will only be able to run the script against your own mailbox, and any mailboxes you have been granted Full access permissions on. But as always, feel free to replace the authentication section (lines 77-102) with your own preferred method of obtaining an access token.
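For readers who want to swap in their own authentication code, here is a minimal sketch of the OAuth 2.0 client credentials flow, which is what the application permissions model boils down to. This is not the script's own authentication section: the function names New-GraphTokenRequest and Get-GraphAuthHeader are hypothetical, the tenant and app values are placeholders, and a client secret is assumed (a certificate would be preferable in practice).

```powershell
# Hypothetical helper: builds the token endpoint URI and form body
# for the OAuth 2.0 client credentials grant.
function New-GraphTokenRequest {
    param(
        [Parameter(Mandatory)][string]$TenantId,
        [Parameter(Mandatory)][string]$ClientId,
        [Parameter(Mandatory)][string]$ClientSecret
    )
    [pscustomobject]@{
        Uri  = "https://login.microsoftonline.com/$TenantId/oauth2/v2.0/token"
        Body = @{
            grant_type    = 'client_credentials'
            client_id     = $ClientId
            client_secret = $ClientSecret
            scope         = 'https://graph.microsoft.com/.default'
        }
    }
}

# Hypothetical helper: requests the token and returns a header hashtable
# that can be passed to later Graph requests.
function Get-GraphAuthHeader {
    param([Parameter(Mandatory)]$TokenRequest)
    $response = Invoke-RestMethod -Method Post -Uri $TokenRequest.Uri -Body $TokenRequest.Body -ErrorAction Stop
    @{ Authorization = "Bearer $($response.access_token)" }
}

# Usage (placeholder values, not real identifiers):
# $request    = New-GraphTokenRequest -TenantId 'contoso.onmicrosoft.com' -ClientId '<app id>' -ClientSecret '<secret>'
# $authHeader = Get-GraphAuthHeader -TokenRequest $request
```

Splitting the request construction from the actual call keeps the network-dependent part in one small function, which makes it easier to replace with another token acquisition method.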

Now, one of the biggest downsides of using the Graph is its lack of support for any Exchange Online admin-level operations, such as listing all mailboxes within the organization. Heck, we still don’t even have a proper way to check whether a given user has a mailbox, and the closest we can get is to leverage the userPurpose property of the mailboxSettings resource. This in turn translates into having to issue an additional Graph API request for each user you want to generate the item age report for. Far from ideal, but it’s doable. This is also the reason why the script requires the MailboxSettings.Read permission.

After handling authentication, the script will parse the set of user(s) provided as input, and verify whether a matching Exchange Online mailbox exists. The -Mailbox script parameter accepts multiple values (as a comma-separated string or a list object), and you can use either the user’s UserPrincipalName or GUID as the identifier. Unfortunately, you cannot use the other identifiers that you might be accustomed to when working with the Exchange tools, such as SMTP addresses, alias, distinguishedName and so on. For each entry provided, a request against the /mailboxSettings endpoint will be made, and if no valid response is returned, the corresponding entry will be removed from the list.
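The input handling and mailbox check described above can be sketched roughly as follows. This is an illustration, not the script's actual code: ConvertTo-MailboxList and Test-GraphMailbox are hypothetical names, and $AuthHeader is assumed to hold a valid Authorization header.

```powershell
# Hypothetical helper: normalizes the -Mailbox input (comma-separated string
# or list) into a clean list of identifiers, dropping blanks and exact duplicates.
function ConvertTo-MailboxList {
    param([string[]]$Mailbox)
    $Mailbox |
        ForEach-Object { $_ -split ',' } |
        ForEach-Object { $_.Trim() } |
        Where-Object { $_ } |
        Select-Object -Unique
}

# Hypothetical helper: returns $true if the /mailboxSettings request succeeds
# for the given user, $false otherwise.
function Test-GraphMailbox {
    param(
        [Parameter(Mandatory)][string]$UserId,        # UPN or GUID
        [Parameter(Mandatory)][hashtable]$AuthHeader
    )
    $uri = "https://graph.microsoft.com/v1.0/users/$UserId/mailboxSettings"
    try {
        $null = Invoke-RestMethod -Method Get -Uri $uri -Headers $AuthHeader -ErrorAction Stop
        return $true
    }
    catch {
        Write-Verbose "No valid response for $UserId, skipping: $($_.Exception.Message)"
        return $false
    }
}

# Usage (needs a valid $authHeader):
# $mailboxes = ConvertTo-MailboxList -Mailbox 'user1@domain.com, user2@domain.com' |
#     Where-Object { Test-GraphMailbox -UserId $_ -AuthHeader $authHeader }
```

Note that this sketch treats any failed request as "no mailbox", including transient errors, so a more careful version would inspect the status code before discarding an entry.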

Once we have the set of mailboxes to process, the script will iterate over each mailbox. Instead of enumerating all the folders, we will leverage Graph’s filtering capabilities to only return folders with one or more items within them. In addition, we will leverage the /beta version of the /mailFolders Graph API endpoint, as it returns subfolders within the result, thus saving us from having to use recursion to iterate over each (sub)folder. Here’s what a sample query looks like:

GET https://graph.microsoft.com/beta/users/user@domain.com/mailFolders?includeHiddenFolders=true&$top=999&$filter=totalItemCount gt 0
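A mailbox can hold more folders than fit in a single page, so the response to the query above may carry an @odata.nextLink that needs to be followed. Below is a minimal sketch of how that could be handled. The names Get-MailFolderUri and Get-GraphPagedResult are hypothetical, and the -Requester parameter is an assumption added purely so the paging logic can be exercised without a live tenant.

```powershell
# Hypothetical helper: builds the folder query for a given user.
function Get-MailFolderUri {
    param([Parameter(Mandatory)][string]$UserId)
    "https://graph.microsoft.com/beta/users/$UserId/mailFolders?includeHiddenFolders=true&`$top=999&`$filter=totalItemCount gt 0"
}

# Hypothetical helper: follows @odata.nextLink until all pages are collected.
# By default each page is fetched with Invoke-RestMethod.
function Get-GraphPagedResult {
    param(
        [Parameter(Mandatory)][string]$Uri,
        [hashtable]$AuthHeader = @{},
        [scriptblock]$Requester = { param($u, $h) Invoke-RestMethod -Method Get -Uri $u -Headers $h -ErrorAction Stop }
    )
    $results = @()
    $next = $Uri
    while ($next) {
        $page = & $Requester $next $AuthHeader
        $results += $page.value                 # items on the current page
        $next = $page.'@odata.nextLink'         # $null on the last page
    }
    $results
}

# Usage (needs a valid $authHeader):
# $folders = Get-GraphPagedResult -Uri (Get-MailFolderUri -UserId 'user@domain.com') -AuthHeader $authHeader
```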

With that, we now have a list of all (sub)folders within the given mailbox which contain one or more items. Next, the script will leverage the Process-Folder helper function to provide the breakdown of items within the folder, by year. Here, we are again benefiting from Graph’s robust filtering capabilities, which allow us to get all the required information with just a handful of queries. First, we find the “oldest” message within the folder, and thus determine the “start” year. Next, we run “incremental” filters by year, to get all the messages within the given time period. Another trick we use here is to combine the $top=1 and $count=true query parameters, effectively getting all the desired data with minimal overhead and without needing to fetch and iterate over all of the items within a given (sub)folder. That’s only valid if you want to report on item count though, as we discuss below.

GET https://graph.microsoft.com/v1.0/users/{userId}/mailFolders/{folderId}/messages?$top=1&$orderby=createdDateTime asc&$select=id,createdDateTime

GET https://graph.microsoft.com/v1.0/users/{userId}/mailFolders/{folderId}/messages?$top=1&$filter=createdDateTime+ge+$startDate+and+createdDateTime+lt+$endDate&$count=true&$select=id,createdDateTime
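To make the “incremental” filters more concrete, here is a rough sketch that generates one count query per year, starting from the year of the oldest item. It is not the actual Process-Folder code: the function name Get-YearlyCountQuery and its parameters are hypothetical, and year boundaries are assumed to be in UTC.

```powershell
# Hypothetical helper: emits one count query per year, from $StartYear
# (the year of the oldest item) up to and including $EndYear.
function Get-YearlyCountQuery {
    param(
        [Parameter(Mandatory)][string]$UserId,
        [Parameter(Mandatory)][string]$FolderId,
        [Parameter(Mandatory)][int]$StartYear,
        [int]$EndYear = (Get-Date).Year,
        [string]$DateProperty = 'createdDateTime'
    )
    if ($StartYear -gt $EndYear) { return }     # nothing to report on
    $base = "https://graph.microsoft.com/v1.0/users/$UserId/mailFolders/$FolderId/messages"
    foreach ($year in $StartYear..$EndYear) {
        # Assumption: year boundaries are expressed in UTC
        $startDate = '{0}-01-01T00:00:00Z' -f $year
        $endDate   = '{0}-01-01T00:00:00Z' -f ($year + 1)
        [pscustomobject]@{
            Year = $year
            Uri  = "${base}?`$top=1&`$count=true&`$select=id,$DateProperty&`$filter=$DateProperty ge $startDate and $DateProperty lt $endDate"
        }
    }
}

# Reading the count from a response (needs a valid $authHeader):
# foreach ($query in Get-YearlyCountQuery -UserId $userId -FolderId $folderId -StartYear 2019) {
#     $response = Invoke-RestMethod -Method Get -Uri $query.Uri -Headers $authHeader -ErrorAction Stop
#     '{0}: {1} items' -f $query.Year, $response.'@odata.count'
# }
```

Because only a single item is requested per query, the number of requests per folder scales with the number of years covered rather than the number of items. The hypothetical -DateProperty parameter is there to show where a different date property could be substituted.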

The rest of the script handles the output. For each mailbox, a list of folders with at least one item is returned, along with a breakdown of the number of items per year. If you opted to include the item size (via the -IncludeItemSize parameter), it will be shown as well. Otherwise, expect to see a zero (0). By default, the output is one line per mailbox/folder/year combo, which should allow for easier manipulation. Alternatively, you can use the -CompactOutput switch, which will produce a single line per mailbox/folder. A semicolon-concatenated string is used for the per-year breakdown, for example: “2022=100:123456;2023=200:456789;2024=0:0”. The output is then written to a CSV file within the working directory, which you can use to perform additional manipulations of the data.
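As an illustration of the compact format, the sketch below builds the “year=count:size” string from a set of per-year stats, and parses it back into objects for further processing. The function names and the Year, ItemCount and ItemSize property names are hypothetical and may not match the ones the script uses.

```powershell
# Hypothetical helper: turns per-year stats into the compact "year=count:size" string.
function ConvertTo-CompactBreakdown {
    param([Parameter(Mandatory)][object[]]$YearStats)
    ($YearStats |
        Sort-Object -Property Year |
        ForEach-Object { '{0}={1}:{2}' -f $_.Year, $_.ItemCount, $_.ItemSize }) -join ';'
}

# Hypothetical helper: parses the compact string back into one object per year.
function ConvertFrom-CompactBreakdown {
    param([Parameter(Mandatory)][string]$Breakdown)
    foreach ($entry in $Breakdown -split ';') {
        $year, $values = $entry -split '='
        $count, $size  = $values -split ':'
        [pscustomobject]@{ Year = [int]$year; ItemCount = [int]$count; ItemSize = [long]$size }
    }
}

$stats = @(
    [pscustomobject]@{ Year = 2022; ItemCount = 100; ItemSize = 123456 }
    [pscustomobject]@{ Year = 2023; ItemCount = 200; ItemSize = 456789 }
    [pscustomobject]@{ Year = 2024; ItemCount = 0;   ItemSize = 0 }
)
ConvertTo-CompactBreakdown -YearStats $stats
# -> 2022=100:123456;2023=200:456789;2024=0:0
```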

Before showing some examples, let’s also introduce the set of script parameters:

  • Mailbox – used to designate the mailbox(es) against which to run the script. Technically, you provide a user identifier, so either a UPN or GUID value. Multiple values can be provided as a comma-separated string, or a list variable.
  • IncludeItemSize – switch parameter, used to include statistics about the total size of items per year. Do note that including this switch will cause the script to run a lot slower, as multiple additional queries are required and the metadata of all items within the mailbox is retrieved. In addition, using this switch changes the script permission requirements to include the Mail.Read scope instead of Mail.ReadBasic.All, as this is needed in order to include the “size” property.
  • CompactOutput – switch parameter, used to change the output to one line per mailbox/folder combo.
  • Verbose – switch parameter, used to spill out additional details about the script execution, as well as “preview” the folder stats within the console.

You can download the script from my GitHub repo. Before running it, make sure you update the authentication-related variables (lines 77-81). Again, application permissions are recommended, and the corresponding app must have the following grants: User.Read.All, MailboxSettings.Read and Mail.ReadBasic.All (or Mail.Read if using the -IncludeItemSize switch). Once the prerequisites have been met, you can run the script by leveraging one of the examples below:

#Use the -Mailbox parameter to specify the mailbox for which to generate the report
.\Mailbox_Messages_PerYear.ps1 -Mailbox user@domain.com

#The -Mailbox parameter accepts a comma-separated list of mailboxes
.\Mailbox_Messages_PerYear.ps1 -Mailbox user1@domain.com,user2@domain.com,...

#Alternatively, you can use a CSV file as input, or leverage other cmdlets
.\Mailbox_Messages_PerYear.ps1 -Mailbox (Import-Csv blabla.csv).UserPrincipalName
.\Mailbox_Messages_PerYear.ps1 -Mailbox (Get-User -Filter {Department -eq "Sales" -and RecipientTypeDetails -eq "UserMailbox"}).UserPrincipalName

#Use -IncludeItemSize to gather item size stats. Do note this impacts the script performance!
.\Mailbox_Messages_PerYear.ps1 -Mailbox user@domain.com -IncludeItemSize

#Use -CompactOutput to change the output format to one line per mailbox/folder combo
.\Mailbox_Messages_PerYear.ps1 -Mailbox user@domain.com -CompactOutput

#Use the -Verbose switch to output additional information, including per-folder stats in "human-readable" format
.\Mailbox_Messages_PerYear.ps1 -Mailbox user1@domain.com,user2@domain.com -Verbose

The first screenshot below illustrates the type of output you can expect to see when running the script with the -Verbose parameter. The second one shows what the CSV output file will look like by default. A filter was applied against the “Year” and “Item count” columns, so that only folders containing at least one message from 2023 are displayed. Both examples also leverage the -IncludeItemSize parameter.

[Screenshot: MailboxItemsPerYear3, console output when running with -Verbose]

[Screenshot: MailboxItemsPerYear2, default CSV output, filtered on the “Year” and “Item count” columns]

Before closing the article, a few notes. The script currently uses the createdDateTime property to determine the age of an item, and while this should be accurate, in some cases you might be interested in other properties, such as the receivedDateTime or sentDateTime. Keep in mind that such properties might have null values for messages in the Drafts folder, or other items. But if needed, you can always adjust the filters used to leverage said properties instead.

Another thing to keep in mind is that folder names are not unique. While the script can be modified to include the full folder path, this will require some additional queries, so I opted to include the unique FolderId instead. Also, the script will not include any results from Online archive mailboxes, as the Graph API does not currently support working with them. Speaking of things the Graph does not currently support, any IPM.Appointment items are NOT returned, thus you can expect some discrepancies between the total number of items in the folder (as reported by the totalItemCount property on the folder object) and the sum of per-year item counts (returned by /messages endpoint queries, which do NOT include appointments/tasks).

As always, consider this script a “proof of concept” and feel free to make any changes as you see fit. For example, you might want to do a breakdown per month, instead of (or in addition to) per year. And lastly, the customary warning. Do not use the script against production environments unless you fully understand what it does, and even so, do make sure to add some proper error handling and better throttling controls. Feedback is always welcome!
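On the throttling point, one simple approach is to wrap each Graph request in a retry loop with exponential backoff. The sketch below is one possible way to do that and is not part of the script: Invoke-WithRetry is a hypothetical name, and it retries on any error rather than only on HTTP 429 responses. A more careful version would inspect the status code and honor the Retry-After header that Graph returns when throttling.

```powershell
# Hypothetical helper: runs a script block, retrying with exponential backoff on failure.
function Invoke-WithRetry {
    param(
        [Parameter(Mandatory)][scriptblock]$Action,
        [int]$MaxAttempts = 5,
        [double]$BaseDelaySeconds = 2
    )
    for ($attempt = 1; $attempt -le $MaxAttempts; $attempt++) {
        try {
            return & $Action
        }
        catch {
            # Out of attempts: rethrow the last error to the caller
            if ($attempt -eq $MaxAttempts) { throw }
            # Delay doubles with each failed attempt: base, 2x base, 4x base, ...
            $delay = $BaseDelaySeconds * [math]::Pow(2, $attempt - 1)
            Write-Verbose "Attempt $attempt failed: $($_.Exception.Message). Retrying in $delay seconds."
            Start-Sleep -Milliseconds ([int]($delay * 1000))
        }
    }
}

# Usage (needs a valid $uri and $authHeader):
# $response = Invoke-WithRetry -Action {
#     Invoke-RestMethod -Method Get -Uri $uri -Headers $authHeader -ErrorAction Stop
# }
```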
