Aggregation Operations

Aggregation operations process multiple documents and return computed results. You can use aggregation operations to:

Group values from multiple documents together.
Perform operations on the grouped data to return a single result.
Analyze data changes over time.
Query the most up-to-date version of your data.

By using the built-in aggregation operators in MongoDB, you can perform analytics on your cluster without having to move your data to another platform.

Get Started

To perform aggregation operations, you can use:

Aggregation pipelines, which are the preferred method for performing aggregations.
Single purpose aggregation methods, which are simple but lack the capabilities of an aggregation pipeline.

You can run aggregation pipelines in the UI for deployments hosted in MongoDB Atlas.

Aggregation Pipelines

An aggregation pipeline consists of one or more stages that process documents. These documents can come from a collection, a view, or a specially designed stage.

Each stage performs an operation on the input documents. For example, a stage can filter documents, group documents, and calculate values. The documents that a stage outputs are then passed to the next stage in the pipeline.

An aggregation pipeline can return results for groups of documents. You can also update documents with an aggregation pipeline using the stages shown in Updates with Aggregation Pipeline.

Note

Aggregation pipelines run with the db.collection.aggregate() method do not modify documents in a collection, unless the pipeline contains a $merge or $out stage.
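As a rough sketch of the note above, the following defines two pipelines as plain arrays: one that only reads, and one that ends with an $out stage and therefore writes its results to a collection. The field names, the target collection name direct_route_counts, and the helper writesToCollection are hypothetical and used only for illustration.

```javascript
// Read-only pipeline: aggregate() returns results and leaves collections unchanged.
const readOnlyPipeline = [
  { $match: { stops: 0 } },
  { $group: { _id: "$src_airport", count: { $sum: 1 } } }
];

// Same stages plus $out, which writes the results to the named collection
// (hypothetical name), replacing that collection if it already exists.
const writingPipeline = [
  ...readOnlyPipeline,
  { $out: "direct_route_counts" }
];

// Hypothetical helper: true if any stage in the pipeline is $out or $merge.
function writesToCollection(pipeline) {
  return pipeline.some(stage => "$out" in stage || "$merge" in stage);
}

// In mongosh, against a collection with these fields, you would run for example:
//   db.routes.aggregate(writingPipeline)
```

Checking a pipeline with a helper like this before running it is one way to avoid overwriting a collection by accident.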

Aggregation Pipeline Example

The following example pipeline uses documents from the sample data available in MongoDB Atlas, specifically the sample_training.routes collection. In this pipeline, we'll find the top three airlines that offer the most direct flights out of the airport in Portland, Oregon, USA (PDX).

First, add a $match stage to filter the documents to flights that have a src_airport value of PDX and zero stops:

{
   $match : {
      "src_airport" : "PDX",
      "stops" : 0
   }
}

The $match stage reduces the number of documents in our pipeline from 66,985 to 113. Next, add a $group stage to group the documents by airline name and count the number of flights:

{
   $group : {
      _id : {
         "airline name": "$airline.name"
      },
      count : {
         $sum : 1
      }
   }
}

The $group stage reduces the number of documents in the pipeline to 16, one for each airline. To find the airlines with the most flights, use the $sort stage to sort the remaining documents by count in descending order:

{
   $sort : {
      count : -1
   }
}

After you sort your documents, use the $limit stage to return the top three airlines that offer the most direct flights out of PDX:

{
   $limit : 3
}

After putting the documents in the sample_training.routes collection through this aggregation pipeline, the top three airlines offering non-stop flights from PDX are Alaska, American, and United Airlines with 39, 17, and 13 flights, respectively.

The full pipeline resembles the following:

db.routes.aggregate( [
   {
      $match : {
         "src_airport" : "PDX",
         "stops" : 0
      }
   },
   {
      $group : {
         _id : {
            "airline name": "$airline.name"
         },
         count : {
            $sum: 1
         }
      }
   },
   {
      $sort : {
         count : -1
      }
   },
   {
      $limit : 3
   }
] )

For runnable examples containing sample input documents, see Complete Aggregation Pipeline Examples.
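To make the flow of documents between stages concrete, the following sketch emulates the four stages in plain JavaScript on an in-memory array. It does not use MongoDB at all. The sample documents and airline names are hypothetical and only mimic the shape of the fields used in the pipeline above.

```javascript
// Hypothetical documents shaped like the fields the pipeline reads.
const routes = [
  { airline: { name: "Airline A" }, src_airport: "PDX", stops: 0 },
  { airline: { name: "Airline A" }, src_airport: "PDX", stops: 0 },
  { airline: { name: "Airline B" }, src_airport: "PDX", stops: 0 },
  { airline: { name: "Airline B" }, src_airport: "SFO", stops: 0 },
  { airline: { name: "Airline C" }, src_airport: "PDX", stops: 1 }
];

// $match equivalent: keep non-stop flights that leave from PDX.
const matched = routes.filter(r => r.src_airport === "PDX" && r.stops === 0);

// $group equivalent: one output document per airline name, with a running count.
const counts = new Map();
for (const r of matched) {
  counts.set(r.airline.name, (counts.get(r.airline.name) || 0) + 1);
}
const grouped = [...counts].map(([name, count]) => ({
  _id: { "airline name": name },
  count
}));

// $sort equivalent: order by count, descending.
const sorted = [...grouped].sort((a, b) => b.count - a.count);

// $limit equivalent: keep at most the first three documents.
const top = sorted.slice(0, 3);

console.log(top);
```

Each intermediate array (matched, grouped, sorted) plays the role of the documents that one stage hands to the next.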

Learn More About Aggregation Pipelines

To learn more about aggregation pipelines, see Aggregation Pipeline.

Single Purpose Aggregation Methods

The single purpose aggregation methods aggregate documents from a single collection. The methods are simple but lack the capabilities of an aggregation pipeline.

Method	Description
`db.collection.estimatedDocumentCount()`	Returns an approximate count of the documents in a collection or a view.
`db.collection.count()`	Returns a count of the number of documents in a collection or a view.
`db.collection.distinct()`	Returns an array of the distinct values for the specified field.
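As a brief sketch of how two of these methods might be called, the function below wraps them for a routes collection. The function name summarizeRoutes, the collection, and the field names are assumptions for illustration. In mongosh, db is the built-in database object; it is passed as a parameter here so the function does not depend on a global.

```javascript
// Hypothetical helper: collects two single purpose aggregations for a
// collection named "routes" in the given database object.
function summarizeRoutes(db) {
  return {
    // Approximate document count; this method takes no query filter.
    approxTotal: db.routes.estimatedDocumentCount(),
    // Distinct airline names among documents whose src_airport is PDX.
    airlinesFromPdx: db.routes.distinct("airline.name", { src_airport: "PDX" })
  };
}

// In mongosh, connected to a database that has such a collection:
//   summarizeRoutes(db)
```

For anything beyond a simple count or a list of distinct values, such as counting per group or sorting the results, an aggregation pipeline is the better fit.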
