flowchart LR
subgraph Sources
OP[On-premises]
CL[Cloud]
SA[SaaS]
end
Sources --> ING[Ingestion]
ING --> STR[(Storage)]
STR --> PRC[Processing]
PRC --> VIZ[Visualization]
OLAP systems are optimized for read-heavy analytical workloads. Data flows from operational sources through ingestion into a centralized store, gets processed, and is surfaced through visualization tools.
NoSQL Database Types
Type
Description
Example
Key-value
Data stored as key-value pairs
Azure Table Storage, Redis
Document
Data stored as JSON-like documents
Azure Cosmos DB, MongoDB
Column family
Data organized into rows and column families
Apache Cassandra, HBase
Graph
Data stored as nodes and edges representing relationships
Azure Cosmos DB (Gremlin API), Neo4j
Data Ingestion Pipeline
flowchart LR
DS[Data Sources] --> RD[(Raw Data)]
RD --> DP[Data Processing]
DP --> CD[(Cleaned Data)]
CD --> Q[Queries & Reports]
CD --> V[Visualizations]
Data Processing Services
Service
Purpose
Azure Functions
Event-driven serverless compute for lightweight processing
Azure Cognitive Services
AI-powered data enrichment (vision, language, speech)
Azure Databricks
Apache Spark-based analytics for large-scale data processing
ETL and ELT Patterns
ETL (Extract, Transform, Load) — data is transformed before loading into the destination:
flowchart LR
E[Extract] --> T["Transform<br/>(basic filtering)"]
T --> L[Load]
ELT (Extract, Load, Transform) — data is loaded first, then transformed in the destination system:
flowchart LR
E[Extract] --> L[Load]
L --> T["Transform<br/>(complex processing)"]
ELT is preferred when the destination has sufficient compute power for transformations (e.g., data warehouses, Azure Synapse Analytics).
Azure SQL Data Services
Service Overview
flowchart TB
subgraph IaaS ["Infrastructure as a Service"]
SQLVM["`**SQL Server on Azure VMs**
Best for re-hosting and apps
requiring OS-level access`"]
end
subgraph PaaS ["Platform as a Service"]
MI["`**Azure SQL Managed Instance**
Best for modernizing
existing apps`"]
DB["`**Azure SQL Database**
Best for building
new cloud apps`"]
end
Migrate to cloud fast while maintaining OS control and complete SQL Server functionality
Solution
Combined performance, security, and analytics backed by the flexibility and hybrid connectivity of Azure
Key features
Full SQL Server and OS access; expansive SQL and OS versions; Windows, Linux, Containers; file stream, DTC, Simple Recovery; SSAS, SSRS, SSIS
Azure differentiators
Extended Security Updates for SQL Server 2008/R2; automated backups; Point in Time Restore with Azure Backup; accelerated storage with Azure Blob Caching; 435% overall return on Azure IaaS investment over five years
SQL Server VM Deployment Choices
Category
Options
Deployment
Marketplace pre-installed on Windows or Linux; install your own; Lift and Shift with Azure Migrate (Azure Site Recovery)
Resource Provider
Unlock licensing and edition flexibility; automated backups and security updates; manage VMs through Azure SQL in portal
Sizes and Storage
Memory or Storage optimized sizes; Tempdb on local SSD; data log on Premium Storage Managed Disks; ultra disks for extremely low latency; Azure Blob Read Caching for data disks
Networking and Security
Virtual Networks to integrate with on-premises; Advanced Data Security services (Preview)
HADR
Azure VM built-in HA; Azure Storage built-in DR; Failover Cluster Instance with Azure Premium File Share; Always On Availability Groups with Cloud Witness; Hybrid Availability Group secondary replicas; HADR on RedHat Linux with Pacemaker and fencing; Azure Backup and automated backups to Azure Blob Storage; File-Snapshot Backups
Azure SQL Database
Aspect
Details
Customer challenge
Build modern, multi-tenanted apps with highest uptime and predictable performance
Solution
Highly scalable cloud database service with built-in HA and machine learning
Key features
Single database or elastic pool; Hyperscale storage (100TB+); serverless compute; fully managed service; Private Link support; high availability with AZ isolation
Azure differentiators
Industry highest availability SLA of 99.995%; business continuity SLA with 5-second RPO and 30-second RTO; price-performance leader costing up to 86% less than AWS RDS (GigaOm)
Azure SQL Managed Instance
Aspect
Details
Customer challenge
Migrate to cloud, remove management overhead, but need instance-scoped features (Service Broker, SQL Server Agent, CLR)
Solution
Combines leading security features with SQL Server compatibility and business model designed for on-premises
Key features
Single instance or instance pool; SQL Server surface area (vast majority); native virtual network support; fully managed service; on-premises identities with Azure AD and AD Connect
Azure differentiators
Near zero downtime migration using log shipping; fully managed business continuity with failover groups; projected ROI of 212% over three years; the best of SQL Server with benefits of a managed service
Managed Instance vs SQL Database
Feature
Azure SQL Managed Instance
Azure SQL Database
Single instance
SQL Server surface area (vast majority); native VNet support; fully managed
Single database; Hyperscale storage (up to 100TB); serverless compute; fully managed
Instance pool
Pre-provision compute resources for migration; enables cost-efficient migration; ability to host smaller instances (2 VCore); currently in public preview
Connections route through a gateway cluster using the proxy connection policy. The gateway redirects the client to the database node hosting the data (redirect-find-db).
Authentication and Access Control
Mixed Mode authentication is enforced. SQL Auth for deployment uses the server admin principal:
Server admin — server-level principal for logical server (DB) and member of sysadmin server role (MI)
Azure AD Authentication options:
Service
Supported Identity Types
Azure SQL Managed Instance
Azure AD Server Admin; SQL or Azure AD Logins; Database Users; SQL Server Contained Database supported
Azure SQL Database
Azure AD Server Admin; SQL logins; loginmanager and dbmanager roles for limited server admins; Database Users; Contained Database Users including Azure AD (recommended)
Network Security
flowchart TB
subgraph EUS ["US EAST — VMVNET-EUS"]
VM1["Azure VM<br/>10.0.0.3"]
end
subgraph SQLEUS ["US EAST — SQLDBVNET-EUS"]
VM2["Azure VM<br/>10.0.0.2"]
VM3["Azure VM<br/>10.0.0.5"]
end
subgraph WUS ["US WEST — VMVNET-WUS"]
VM4["Azure VM<br/>10.0.0.1"]
end
EUS <-->|VNET Peering| SQLEUS
VM3 -->|Private Endpoint| PL["mysqldbsrv.privatelink<br/>.database.windows.net<br/>10.0.0.5"]
VM4 --> GW[VNET Gateway]
GW -->|"Vnet2Vnet / P2S / S2S VPN<br/>Express Route"| ONPREM[On-premises]
Network security layers for Azure SQL Database:
Allow access to Azure services — permit connections from other Azure resources
Firewall Rules — IP-based access control
Virtual Network Rules — restrict access to specific VNet subnets
Private Link — access via private endpoint within VNet; public endpoint blocked (mysqldbsrv.database.windows.net has no internet access)
Azure Role Based Access Control (RBAC)
All Azure operations for Azure SQL are controlled through RBAC
Security principal and role-based system
Scope includes subscription, resource group, and resource
Decoupled from SQL Security (today)
Applies to operations in Azure portal and CLI
Allows for separation of duties for deployment, management, and usage
Azure locks help protect resources from delete or read-only
Built-in Azure SQL roles reduce need for owner
Built-in role
Scope
SQL DB Contributor
Database operations
SQL Managed Instance Contributor
Managed Instance operations
SQL Security Manager
Security configuration
SQL Server Contributor
Server-level operations
Read Replicas
Read replicas improve performance and scale of read-intensive workloads such as BI and analytics
Consider the read replica feature when delays in syncing data between master and replicas are acceptable
Create a replica in a different Azure region from the master for a disaster recovery plan
Data storage on replica servers grows automatically without impacting workloads
Create up to five replicas of the master server for Application, BI and Analytics Reporting, and Dashboard workloads
Non-Relational Data
Blob Storage Types
Type
Max Size
Use Case
Details
Block blobs
4.7 TB
Large, discrete binary objects that change infrequently
Individual blocks up to 100 MB; up to 50,000 blocks per blob
Page blobs
8 TB
Random read/write access
Organized as a collection of fixed-sized 512-byte pages; used for virtual disk storage (VMs)
Append blobs
195 GB
Optimized for append operations
Individual blocks up to 4 MB; a block blob variant
Cosmos DB Use Cases
Scenario
Description
Web and retail
Multi-master replicated model alongside Microsoft’s e-commerce; data architecture to support web and mobile with sub-10ms response times; in-game stats, social media integration, and high-score leader boards
Gaming
Engineers perform graphical processing on mobile/console clients but rely on the cloud for personalized and customizable content
IoT
Hundreds of thousands of devices designed and sold to sense remote data across Internet of Things (IoT) devices; using technologies like Azure IoT Hub, Data Engineers can easily design data solutions that capture real-time data; Cosmos DB can accept and store this information very quickly