Platform Overview
Advana — short for Advancing Analytics — is the Department of Defense's enterprise-wide data, analytics, and artificial intelligence platform. It started inside the Office of the Secretary of Defense Comptroller as a financial audit tool. The DoD had been failing independent audits for years, and the root cause kept coming back to the same problem: DoD data lived in thousands of incompatible systems, and no one could pull a consistent picture of anything. Advana was the attempt to fix that.
It grew fast: by 2024, Advana had somewhere between 80,000 and 100,000 users, more than 400 connected Pentagon business systems, over 3,000 NIPRNET data sources ingested, and 250-plus applications in production across 55-plus DoD organizations.
The organizational home matters. Advana lives under the Chief Digital and Artificial Intelligence Office (CDAO), which reports up through the DoD chain. The Deputy Secretary of Defense formally designated Advana as "the single enterprise authoritative data management and analytics platform" for OSD and all DoD components. When a program office tells you they want analytics built on authoritative DoD data, Advana is what that phrase typically points to.
The original prime contractor is Booz Allen Hamilton, working under a five-year, $674 million GSA contract awarded in 2021. The planned $15 billion follow-on recompete (AAMAC) was canceled in July 2025.
Getting Access
Access to Advana is not complicated, but it has real prerequisites: a CAC or PIV card, a sponsor who can justify your access, and patience with a form.
You will need a DD Form 2875, the System Authorization Access Request form. This requires a government sponsor — typically your contracting officer representative (COR) or the program office point of contact — to sign off on the business need and appropriate access level.
The Process
- Navigate to https://advana.data.mil
- Authenticate with your CAC certificate when prompted
- If this is your first access, the portal routes you to submit a Help Desk ticket
- Attach your completed DD Form 2875 to that ticket
- Wait — typical provisioning runs one to three weeks
Advana is organized into ten community spaces — functional groupings of data and applications for specific mission areas. Your access request should specify which space or spaces you need. Requesting access to everything is not how this works; you need a justified business need for each space.
Available Tools
Visualization and Business Intelligence
Qlik Sense is the primary BI and visualization layer on Advana. Most of the pre-built dashboards that consumers and analysts use are built in Qlik. The platform supports Qlik's Server-Side Extensions (SSE), which let the analytics layer call out to Python, C++, or Java code, embedding statistical computations or ML inference directly in a Qlik dashboard without exporting data to a separate environment.
Tableau and Power BI are also available within the Advana ecosystem, though Qlik is the platform's native BI layer.
Data Science and Machine Learning
Databricks is the data lakehouse and ML platform within Advana. This is where data scientists doing serious modeling work will spend most of their time — Python and PySpark notebooks, Spark-based distributed processing, and access to the Unity Catalog data governance layer.
MLflow comes packaged with Databricks and handles experiment tracking, model registry, and deployment management. DataRobot and C3 AI provide automated ML capabilities. Amazon SageMaker is available for model training and deployment at scale.
Data Governance
Collibra is the data catalog and governance layer. Before you start pulling datasets, Collibra is where you verify what a field actually means, who owns it, when it was last updated, and whether the definition has changed. This is not optional discipline — DoD data has a persistent labeling problem, and Collibra is the official place to resolve ambiguity.
Development and Source Control
GitLab handles source code management and CI/CD pipelines within the Advana environment. Your model code, pipeline scripts, and dashboard configurations should live in GitLab, not on your local machine or an ad-hoc shared drive.
Data Access
What's Connected
Over 400 Pentagon business systems feed into Advana. The major ones data scientists encounter regularly:
- GFEBS (General Fund Enterprise Business System) — Army financial management. Real-time streaming feeds with millisecond-level latency.
- GCSS-Army (Global Combat Support System — Army) — Army logistics. Same real-time streaming pipeline as GFEBS.
- DMDC — Defense Manpower Data Center. Personnel and human resources data.
- SAM.gov Contract Data (formerly FPDS-NG) — Federal procurement and contract data across all DoD components. Migrated from FPDS.gov in February 2026.
Classification Levels
| Network | Level | What Lives There |
|---|---|---|
| NIPRNET | Unclassified (CUI/FOUO) | Business operations data; 3,000+ sources |
| SIPRNET | Secret | Classified analytics and force planning data |
Most financial, logistics, procurement, and personnel analytics work happens on NIPRNET at the CUI or FOUO sensitivity level, which maps to Impact Level 4. SIPRNET access requires a Secret clearance and a separate access request.
Data Science Workflows
The Four Tiers
Consumers get pre-built Qlik dashboards for their functional area. Analysts build custom dashboards in Qlik and combine data across sources using the Qlik mashup API. Data scientists work in Databricks notebooks — Python, PySpark, and SQL, with MLflow for experiment tracking. Data engineers build and maintain pipelines, set up real-time replication feeds, and manage data quality monitoring.
```python
# Querying procurement data from Advana's Databricks environment
# Authentication happens via CAC-backed session token
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("procurement_analysis").getOrCreate()

# Read from the Unity Catalog — data is organized by domain
df = spark.sql("""
    SELECT
        contract_id,
        vendor_name,
        obligation_amount,
        award_date,
        naics_code,
        reporting_agency
    FROM advana_catalog.procurement.fpds_awards_fy2024
    WHERE reporting_agency = 'DEPARTMENT OF THE ARMY'
      AND obligation_amount > 1000000
""")

# Basic sanity check — procurement data has duplicate records
print(f"Total rows: {df.count():,}")
print(f"Unique contracts: {df.select('contract_id').distinct().count():,}")

# Check for the most common data quality issue: obligation amount sign errors
negative_obligations = df.filter(F.col("obligation_amount") < 0).count()
print(f"Negative obligations (deobligations or errors): {negative_obligations:,}")
```
```python
# MLflow experiment tracking example — readiness model development
import mlflow
import mlflow.sklearn
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

df = spark.table("advana_catalog.readiness.unit_readiness_daily").toPandas()

features = ["equipment_pct_mc", "personnel_fill_rate", "training_completion_pct",
            "parts_availability_score", "days_since_last_assessment"]
target = "c_rating"  # C1/C2/C3/C4 readiness rating

X = df[features].fillna(df[features].median())
y = df[target].isin(["C1", "C2"]).astype(int)  # Binary: ready vs. degraded
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

with mlflow.start_run(run_name="readiness_rf_v1"):
    mlflow.log_param("n_estimators", 100)
    mlflow.log_param("max_depth", 8)
    mlflow.log_param("features", features)

    model = RandomForestClassifier(n_estimators=100, max_depth=8, random_state=42)
    model.fit(X_train, y_train)

    preds = model.predict(X_test)
    report = classification_report(y_test, preds, output_dict=True)
    mlflow.log_metric("precision_ready", report["1"]["precision"])
    mlflow.log_metric("recall_ready", report["1"]["recall"])
    mlflow.log_metric("f1_ready", report["1"]["f1-score"])
    mlflow.sklearn.log_model(model, "readiness_classifier")

print(f"F1 (ready): {report['1']['f1-score']:.3f}")
```
You cannot `pip install` arbitrary packages from the internet in an IL4/IL5 environment. The Databricks environment has a curated list of approved packages. If you need something not on that list, you submit a request through the platform's software approval chain, which can take weeks. Build that into your project timeline.
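A useful preflight at project kickoff is to check which of your required packages are already importable in the environment, so approval requests for the rest go in early. The package names in the example list are placeholders, not Advana's actual approved set:

```python
import importlib.util

def missing_packages(required: list[str]) -> list[str]:
    """Return the subset of packages that cannot be imported here."""
    return [name for name in required if importlib.util.find_spec(name) is None]

# Hypothetical requirements list for a modeling project
required = ["pandas", "sklearn", "some_niche_library"]
to_request = missing_packages(required)
print(f"Submit approval requests for: {to_request}")
```

Run this in a notebook on day one and attach the output to your software approval request — it documents exactly what the project needs that the environment lacks.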
Data Flow by Classification Level
Advana data flow by classification level. NIPRNET and SIPRNET are separate pipelines with separate access controls and compute environments.
Current State (2025–2026)
The Workforce Reduction
In February and March 2025, following Defense Secretary Hegseth's directive for 5–8% civilian workforce cuts, CDAO lost approximately 60% of its workforce. Contracted support staff was reduced by roughly 80%. The practical result: platform development slowed, some capabilities that required active maintenance degraded, and the institutional knowledge embedded in those positions walked out the door.
The AAMAC Cancellation
In July 2025, the planned 10-year, $15 billion recompete (the Advancing Artificial Intelligence Multiple Award Contract) was formally canceled. Booz Allen Hamilton's existing $674 million contract continues. The future contracting vehicle for Advana's expansion is unresolved as of early 2026.
The January 2026 Restructuring
On January 9, 2026, Defense Secretary Hegseth signed the "Transforming Advana to Accelerate Artificial Intelligence and Enhance Auditability" memo. It divides Advana into three tracks:
The January 2026 three-way restructuring of Advana. Financial management returns to the Comptroller. The War Data Platform becomes the AI development foundation.
- War Data Platform (WDP) — The new data integration foundation. Designed to be DoD-wide, standardized, and built to support agentic AI use cases at scale.
- Advana for Financial Management — Pulled back under the Office of the Under Secretary of Defense (Comptroller). Focused entirely on audit remediation (FY 2027 and FY 2028 clean audit targets).
- War Data Platform Application Services — Consolidates all non-audit Advana applications. Manages migrations from legacy Advana to the new WDP.
Best Practices
What Works
Start in Collibra before you write a single query. An hour in Collibra before you touch the data saves you two weeks of debugging a model trained on a field that was redefined six months ago.
Use Qlik SSE when the end user is a dashboard consumer, not a data scientist. Qlik SSE lets you embed Python computations directly in a dashboard that program office staff can refresh themselves. Build it that way from day one.
GitLab everything. Treat your Advana work the same way you'd treat any production software: version-controlled, documented, reproducible.
Track the War Data Platform timeline. Tools and pipelines built on legacy Advana infrastructure will need to migrate.
Where This Goes Wrong
The mistake: Expecting to `pip install` packages freely and deploy code with standard DevOps practices. Submit your full package requirements list to the platform team at project kickoff, not when you need them.
The mistake: Designing pipelines that depend on specific legacy Advana infrastructure components scheduled for migration to the War Data Platform. Build modular pipelines with clean interfaces — when the migration happens, you swap the ingestion component, not the entire stack.
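One way to sketch that modularity in Python is to hide the ingestion source behind a narrow interface, so swapping legacy Advana for the War Data Platform means replacing one class rather than rewriting the pipeline. Everything below is illustrative — the class names, table names, and record shapes are assumptions, not real Advana or WDP APIs:

```python
from typing import Iterable, Protocol

class IngestionSource(Protocol):
    """The only contract downstream code is allowed to depend on."""
    def read_records(self, table: str) -> Iterable[dict]: ...

class LegacyAdvanaSource:
    def read_records(self, table: str) -> Iterable[dict]:
        # In practice this would wrap spark.table(f"advana_catalog.{table}")
        return [{"source": "legacy", "table": table}]

class WarDataPlatformSource:
    def read_records(self, table: str) -> Iterable[dict]:
        # Placeholder for the future WDP read path
        return [{"source": "wdp", "table": table}]

def run_pipeline(source: IngestionSource, table: str) -> list[dict]:
    """Transform logic depends only on the interface, not the platform."""
    return [r for r in source.read_records(table) if r]

# Migrating is a one-line change at the call site:
print(run_pipeline(LegacyAdvanaSource(), "procurement.fpds_awards_fy2024"))
```

When the WDP migration lands, only the `IngestionSource` implementation changes; the transform and model code behind `run_pipeline` is untouched.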
Advana provisioning takes one to three weeks minimum, often longer for sensitive community spaces. Submit 2875s before the contract starts. If you cannot do that, the first sprint should be architecture, Collibra data discovery, and training — not analysis that requires live data access.
Platform Close
The one thing to remember: Advana is the DoD's single authoritative data aggregation platform for cross-component analytics, but it is in a significant transition. The War Data Platform restructuring will change how data access and AI applications are built on top of it, and practitioners who understand that transition will make better architecture decisions than those who treat today's platform as permanent.
What to do Monday morning: If you have a DoD analytics assignment coming up, submit your DD Form 2875 this week — before the contract starts. Spend an hour in the Advana University materials at dau.edu before you touch a notebook. Pull up the January 9, 2026 Hegseth restructuring memo at media.defense.gov and read the WDP requirements section.