Overview

NebulaStream (NES) supports Multi-Query Optimization (MQO) by merging the query plans of different queries. This feature is enabled or disabled by setting the enableQueryMerging configuration option to true or false (the default) when starting NES (refer to the configuration documentation for more information).

Query merging is a feature within NES that lets concurrently running queries share computation and data. The system defines several query merging rules. However, the process of deploying and undeploying the Global Query Plan stays the same regardless of which combination of query merging rules is applied. Below we describe the common data structures used during merging, the deployment and undeployment process after merging, and the query merging rules present in NES.

Data Structure Overview

Global Query Plan

The main data structure produced by applying query merging rules is the Global Query Plan (GQP). A GQP contains the plans of all queries running in the system, merged according to the query merging rules applied to them. The example below shows a GQP with 5 running queries. As can be observed, the example GQP contains three separate query plans.
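The grouping of queries into separate plans can be illustrated with a minimal Python sketch. It is not the NES implementation: the function isolated_plans, the operator signature strings, and the assumption that two queries share a plan whenever they have an operator signature in common are all hypothetical and chosen only for illustration. Which of the 5 queries are merged is also an assumption, apart from Q5 standing alone.

```python
def isolated_plans(query_operators):
    """Group queries that share at least one operator, directly or transitively."""
    plans = []  # each entry: (set of query ids, set of operator signatures)
    for query_id, operators in query_operators.items():
        merged_queries, merged_ops = {query_id}, set(operators)
        remaining = []
        for queries, ops in plans:
            if ops & merged_ops:
                # The new query shares an operator with this plan: merge them.
                merged_queries |= queries
                merged_ops |= ops
            else:
                remaining.append((queries, ops))
        plans = remaining + [(merged_queries, merged_ops)]
    return [sorted(queries) for queries, _ in plans]


# Hypothetical queries; equal signature strings stand for a shared operator.
query_operators = {
    "Q1": {"source(cars)", "filter(speed>100)", "sink(Q1)"},
    "Q2": {"source(cars)", "filter(speed>100)", "sink(Q2)"},
    "Q3": {"source(bikes)", "sink(Q3)"},
    "Q4": {"source(bikes)", "map(x*2)", "sink(Q4)"},
    "Q5": {"source(trucks)", "sink(Q5)"},
}

plans = isolated_plans(query_operators)
print(plans)  # [['Q1', 'Q2'], ['Q3', 'Q4'], ['Q5']]
```

In this toy model, 5 queries collapse into three separate plans because Q1/Q2 and Q3/Q4 each share operators, while Q5 shares nothing.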

Global Query Metadata

To track the merging information within a GQP, we maintain a second data structure called the Global Query Metadata. It records which queries are merged together to share data and computation. It also holds a unique Global Query Id (GID) for each isolated query plan, that is, each plan that shares no operators with any other plan. The GID is used to transmit changes to the execution layer whenever a specific query plan within the GQP changes. Whenever a new isolated query plan is added to the GQP, a new GID is assigned to it. In the example below, there are three entries, each with a unique GID. As can be observed, Q5 is not merged with any other query but has still been assigned its own GID, 3.
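A minimal sketch of such a metadata structure might look as follows. The class name GlobalQueryMetadataSketch, its methods, and the sequential GID counter starting at 1 are assumptions made for illustration, not the NES API. The grouping of Q1 to Q4 is likewise assumed; only Q5 with GID 3 follows the example above.

```python
class GlobalQueryMetadataSketch:
    """Hypothetical stand-in for the Global Query Metadata: GID -> merged query ids."""

    def __init__(self):
        self.entries = {}   # GID -> set of query ids merged into one isolated plan
        self._next_gid = 1  # assumption: GIDs are handed out sequentially

    def add_isolated_plan(self, query_ids):
        """Register a new isolated plan and assign it a fresh GID."""
        gid = self._next_gid
        self._next_gid += 1
        self.entries[gid] = set(query_ids)
        return gid

    def gid_of(self, query_id):
        """Return the GID of the plan containing the query, or None if unknown."""
        for gid, queries in self.entries.items():
            if query_id in queries:
                return gid
        return None


metadata = GlobalQueryMetadataSketch()
metadata.add_isolated_plan(["Q1", "Q2"])
metadata.add_isolated_plan(["Q3", "Q4"])
metadata.add_isolated_plan(["Q5"])

print(metadata.gid_of("Q5"))  # 3
```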

Query To Global Query Node Map

Because multiple queries can share a single operator, we maintain a map from each query id to the list of Global Query Nodes used by that query. This lets us update the affected Global Query Nodes correctly whenever a query is added, changed, or removed.
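The map can be sketched in a few lines of Python. The names GlobalQueryNode, attach, and detach are hypothetical, as is the idea that each node stores the ids of the queries using it. The sketch only illustrates why a per-query node list makes updates cheap.

```python
from collections import defaultdict


class GlobalQueryNode:
    """Hypothetical operator node that remembers which queries use it."""

    def __init__(self, operator):
        self.operator = operator
        self.query_ids = set()


query_to_nodes = defaultdict(list)  # query id -> list of GlobalQueryNode


def attach(query_id, node):
    """Record that the query uses the node."""
    node.query_ids.add(query_id)
    query_to_nodes[query_id].append(node)


def detach(query_id):
    """Remove the query from all its nodes; return nodes no query uses anymore."""
    orphaned = []
    for node in query_to_nodes.pop(query_id, []):
        node.query_ids.discard(query_id)
        if not node.query_ids:
            orphaned.append(node)
    return orphaned


source = GlobalQueryNode("source(cars)")  # shared by Q1 and Q2
sink_q1 = GlobalQueryNode("sink(Q1)")
sink_q2 = GlobalQueryNode("sink(Q2)")
attach("Q1", source)
attach("Q1", sink_q1)
attach("Q2", source)
attach("Q2", sink_q2)

orphaned = detach("Q1")
print([node.operator for node in orphaned])  # ['sink(Q1)']
```

After removing Q1, the shared source node stays in place because Q2 still uses it; only the sink that belonged to Q1 alone is reported as unused.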

Deploying a New Query

We explain the process of deploying a new query in this section.

  1. We apply a query merging rule to the existing GQP and the newly added query.
  2. Once the new query has been added to, or merged with an existing query plan within, the GQP, we update the GQP, the Global Query Metadata, and the Query to Global Query Node map accordingly.
  3. We then check the Global Query Metadata for entries that were updated as a result of Step 1 (in this example GID 1 is updated).
  4. For each updated metadata entry, we first undeploy the plan using the corresponding GID and then redeploy the updated plan with the same GID.
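The four steps above can be sketched as a small Python model. Everything here is an assumption made for illustration: the class GlobalQueryPlanSketch, the toy merging rule (a query joins the first plan with which it shares an operator signature), and the choice to skip the undeploy for a plan that has never been deployed. The log merely records what would be sent to the execution layer.

```python
class GlobalQueryPlanSketch:
    """Toy model of the deployment flow; not the NES implementation."""

    def __init__(self):
        self.plans = {}       # GID -> {query id: set of operator signatures}
        self.changed = set()  # GIDs whose metadata entry was updated
        self.deployed = set() # GIDs currently running
        self.log = []         # recorded (action, GID) pairs
        self._next_gid = 1

    def add_query(self, query_id, operators):
        # Steps 1 and 2: apply the toy merging rule and update the structures.
        for gid, queries in self.plans.items():
            if any(operators & ops for ops in queries.values()):
                queries[query_id] = set(operators)
                self.changed.add(gid)
                return gid
        # No shared operator: the query forms a new isolated plan with a new GID.
        gid = self._next_gid
        self._next_gid += 1
        self.plans[gid] = {query_id: set(operators)}
        self.changed.add(gid)
        return gid

    def deploy_changes(self):
        # Steps 3 and 4: undeploy each updated plan, then redeploy it under the same GID.
        for gid in sorted(self.changed):
            if gid in self.deployed:
                self.log.append(("undeploy", gid))
            self.log.append(("deploy", gid))
            self.deployed.add(gid)
        self.changed.clear()


gqp = GlobalQueryPlanSketch()
gqp.add_query("Q1", {"source(cars)", "sink(Q1)"})
gqp.deploy_changes()
gqp.add_query("Q2", {"source(cars)", "sink(Q2)"})  # merges into GID 1
gqp.deploy_changes()

print(gqp.log)  # [('deploy', 1), ('undeploy', 1), ('deploy', 1)]
```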

Undeploying a Query

We explain the process of undeploying an existing query in this section.

  1. We first remove the query from the GQP.
  2. We then update the corresponding data structures to reflect the removal of the query.
  3. We then check the Global Query Metadata for entries that were updated as a result of Steps 1 and 2 (in this example GID 2 is updated).
  4. For each updated metadata entry, we first undeploy the plan using the corresponding GID and then redeploy the updated plan with the same GID.
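The undeployment steps can be sketched in the same spirit. The functions remove_query and apply_change are hypothetical, the metadata is reduced to a plain dictionary from GID to query ids, and the grouping of Q1 to Q4 is assumed. The sketch also assumes that a plan left without any query is only undeployed and its entry dropped; the text above does not spell out that case.

```python
def remove_query(entries, query_id):
    """Steps 1 to 3: drop the query and return the GID whose entry changed."""
    for gid, queries in entries.items():
        if query_id in queries:
            queries.discard(query_id)
            return gid
    return None  # the query is not part of any plan


def apply_change(entries, gid, log):
    """Step 4: undeploy the plan, then redeploy it under the same GID."""
    log.append(("undeploy", gid))
    if entries[gid]:
        log.append(("deploy", gid))
    else:
        # Assumption: an entry with no remaining queries is removed, not redeployed.
        del entries[gid]


entries = {1: {"Q1", "Q2"}, 2: {"Q3", "Q4"}, 3: {"Q5"}}
log = []

changed_gid = remove_query(entries, "Q3")
apply_change(entries, changed_gid, log)

print(changed_gid)  # 2
print(log)          # [('undeploy', 2), ('deploy', 2)]
```

Removing Q3 touches only the entry with GID 2, so the plans with GID 1 and GID 3 keep running undisturbed.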
 
query_merging.txt · Last modified: 2020/11/16 14:59 by 134.96.191.189
 