# Ethics, Governance, and Compliance for Federal AI
Documenting intent is not documentation. "The model was designed to be fair and unbiased" is not a fairness finding. The demographic parity ratio is the document. The monitoring owner's name is the document. The specific retraining trigger threshold is the document. This chapter covers the policy stack, NIST AI RMF, bias testing with code, model cards, Unity Catalog governance, and the checklist that separates deployed models from pilots.
## The Federal AI Policy Stack
Federal AI governance operates under a layered policy stack. Understanding which policies apply to your program determines what documentation is required and which review gates you must clear.
- Executive Order 13960 (2020): Established principles for federal agency AI use — transparency, non-discrimination, accountability.
- Executive Order 14110 (2023): Required AI safety and security standards; directed NIST AI RMF adoption across agencies.
- Executive Order 14179 (2025): Revoked EO 14110; shifted emphasis toward American AI leadership and accelerated deployment, directing agencies to review and revise prior AI policy actions.
- DoD AI Ethical Principles (2020): Responsible, Equitable, Traceable, Reliable, Governable — the five principles the DoD adopted for its AI systems.
- DoD Directive 3000.09 (Autonomy in Weapon Systems): Requires appropriate levels of human judgment over the use of force; applies to any system used in autonomous or semi-autonomous weapon contexts.
## NIST AI RMF in Practice
The NIST AI Risk Management Framework (AI RMF) is the practical governance framework most federal programs use. It has four functions: GOVERN, MAP, MEASURE, and MANAGE.
| Function | What it requires | Data scientist deliverable |
|---|---|---|
| GOVERN | Organizational AI risk policies and oversight structures | Understand your program's RAI review process and approver |
| MAP | Identify and categorize AI risks for your specific use case | Written risk identification: accuracy, fairness, misuse, data freshness |
| MEASURE | Quantify identified risks with metrics and thresholds | AUC, demographic parity ratio, FPR by subgroup — all with numbers |
| MANAGE | Implement mitigations; monitor ongoing risk | Monitoring owner, retraining trigger, response plan for flagged metrics |
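The MEASURE and MANAGE deliverables in the table above can be captured as a structured record that a review gate checks automatically. This is a minimal sketch, not an official NIST AI RMF schema — the field names, metric values, and the `measure_gates_pass` helper are all illustrative.

```python
# Illustrative MEASURE/MANAGE record — field names and values are examples,
# not an official NIST AI RMF schema.
measure_manage_record = {
    "measure": {
        "auc": {"value": 0.87, "minimum": 0.80},
        "demographic_parity_ratio": {"value": 0.91, "minimum": 0.80},
        "fpr_by_subgroup": {"group_a": 0.081, "group_b": 0.123},
    },
    "manage": {
        "monitoring_owner": "Jane Analyst",  # a named person, not an office
        "review_cadence": "monthly",
        "retraining_trigger": "AUC < 0.80 on rolling 90-day window",
    },
}

def measure_gates_pass(record: dict) -> bool:
    """Return True only if every MEASURE metric with a recorded minimum meets it."""
    return all(
        m["value"] >= m["minimum"]
        for m in record["measure"].values()
        if isinstance(m, dict) and "minimum" in m
    )
```

Storing thresholds next to measured values keeps the MEASURE evidence and the MANAGE trigger in one reviewable artifact.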
## Bias Testing: Code, Not Intent
Bias analysis must produce numbers. "Bias was considered" is not an analysis. Run the demographic parity and equalized odds checks before any model goes to a program manager, not after.
```python
import pandas as pd
import numpy as np
from sklearn.metrics import confusion_matrix

def bias_audit_report(
    y_true: np.ndarray,
    y_pred: np.ndarray,
    protected_col: pd.Series,
    threshold: float = 0.5,
    reference_group: str | None = None,
) -> pd.DataFrame:
    """
    Compute demographic parity and equalized odds metrics.
    Returns a DataFrame with per-group statistics.
    The dp_ratio < 0.80 threshold is the EEOC 4/5ths rule.
    `y_pred` is assumed already binarized at `threshold`; the threshold is
    recorded in the report header for traceability.
    """
    groups = protected_col.unique()
    results = []
    for group in groups:
        mask = (protected_col == group).values
        n = mask.sum()
        if n < 20:  # skip groups too small for stable rate estimates
            continue
        # labels=[0, 1] guarantees a 2x2 matrix even if one class is absent
        tn, fp, fn, tp = confusion_matrix(
            y_true[mask], y_pred[mask], labels=[0, 1]
        ).ravel()
        positive_rate = (tp + fp) / n
        tpr = tp / (tp + fn) if (tp + fn) > 0 else 0.0
        fpr = fp / (fp + tn) if (fp + tn) > 0 else 0.0
        results.append({
            "group": group,
            "n": int(n),
            "positive_rate": round(float(positive_rate), 3),
            "tpr": round(float(tpr), 3),
            "fpr": round(float(fpr), 3),
        })
    result = pd.DataFrame(results).sort_values("positive_rate")
    if reference_group is None:
        # default reference: the group with the highest positive rate
        reference_group = result.iloc[-1]["group"]
    ref_row = result[result["group"] == reference_group].iloc[0]
    result["dp_ratio"] = (result["positive_rate"] / ref_row["positive_rate"]).round(3)
    result["fpr_ratio"] = (result["fpr"] / ref_row["fpr"]).round(3)
    print(f"\nBias Audit | Threshold: {threshold} | Reference: {reference_group}")
    print("=" * 70)
    print(result.to_string(index=False))
    flagged = result[result["dp_ratio"] < 0.80]
    if len(flagged) > 0:
        print("\nPOTENTIAL DISPARATE IMPACT (dp_ratio < 0.80):")
        for _, row in flagged.iterrows():
            print(f"  Group '{row['group']}': dp_ratio={row['dp_ratio']:.3f}")
        print("\nAction required: Review features, adjust threshold,")
        print("or apply post-processing fairness constraint before deployment.")
    else:
        print("\nNo groups below 4/5ths threshold.")
    return result
```
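A quick way to sanity-check the dp_ratio arithmetic is to compute it directly on synthetic data with a deliberately injected disparity. The group labels and rates below are made up for illustration:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n = 2000
group = pd.Series(np.where(rng.random(n) < 0.5, "A", "B"))
# Inject deliberately disparate positive prediction rates: A ~60%, B ~30%.
y_pred = np.where(group.values == "A",
                  (rng.random(n) < 0.60).astype(int),
                  (rng.random(n) < 0.30).astype(int))

# Demographic parity ratio: min positive rate over max positive rate.
rates = pd.Series(y_pred).groupby(group.values).mean()
dp_ratio = rates.min() / rates.max()
print(f"positive rates:\n{rates}\ndp_ratio = {dp_ratio:.3f}")
```

With a roughly 2:1 disparity in positive rates, dp_ratio lands near 0.5 — well below the 0.80 four-fifths threshold, exactly the case the audit function is built to flag.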
## Proxy Discrimination Scan
The bias analysis above catches direct demographic disparities. It does not catch proxy discrimination — features highly correlated with protected characteristics without directly using them. Common proxies in government datasets: ZIP code/installation correlates with race; prior disciplinary record may reflect biased command culture; MOS/AFSC specialty correlates with gender due to historical assignment patterns.
```python
import itertools
import numpy as np
import pandas as pd
from scipy.stats import pointbiserialr, chi2_contingency
from sklearn.feature_selection import f_classif

def proxy_correlation_scan(
    X: pd.DataFrame,
    protected_cols: list[str],
    feature_cols: list[str],
    correlation_threshold: float = 0.15,
) -> pd.DataFrame:
    """
    Identify features with potentially concerning correlations to protected attributes.
    Flagged features require documentation and monitoring — not automatic exclusion.
    """
    results = []
    for feat_col, prot_col in itertools.product(feature_cols, protected_cols):
        if X[feat_col].nunique() < 2 or X[prot_col].nunique() < 2:
            continue
        if pd.api.types.is_numeric_dtype(X[feat_col]):
            if X[prot_col].nunique() == 2:
                # Numeric feature vs. binary attribute: point-biserial correlation.
                corr, _ = pointbiserialr(
                    X[prot_col].astype("category").cat.codes,
                    X[feat_col].astype(float),
                )
                abs_corr = abs(corr)
            else:
                # Numeric feature vs. multi-class attribute: correlation ratio
                # (eta) derived from the one-way ANOVA F statistic.
                codes = X[prot_col].astype("category").cat.codes
                f_stat, _ = f_classif(X[[feat_col]].fillna(0), codes)
                g = codes.nunique()
                df_between, df_within = g - 1, len(X) - g
                eta_sq = (f_stat[0] * df_between) / (f_stat[0] * df_between + df_within)
                abs_corr = float(np.sqrt(eta_sq))
        else:
            # Categorical feature vs. attribute: Cramér's V.
            contingency = pd.crosstab(X[feat_col], X[prot_col])
            chi2, _, _, _ = chi2_contingency(contingency)
            n = contingency.to_numpy().sum()
            k = min(contingency.shape) - 1
            abs_corr = float(np.sqrt(chi2 / (n * k))) if (n * k) > 0 else 0.0
        if abs_corr >= correlation_threshold:
            results.append({
                "feature": feat_col,
                "protected_attribute": prot_col,
                "correlation": round(abs_corr, 4),
            })
    report = pd.DataFrame(results)
    if len(report) > 0:
        report = report.sort_values("correlation", ascending=False)
        print(f"Proxy correlation scan — {len(report)} feature-attribute pairs above threshold:")
        print(report.to_string(index=False))
    return report
```
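The Cramér's V branch used for categorical features can be checked by hand on a small synthetic contingency table. The `zip_band` feature and group labels below are invented to produce an obvious association — the pattern the scan is meant to flag:

```python
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency

# Synthetic data: a feature ("zip_band") strongly associated with a
# protected attribute — e.g. the ZIP-code-as-race-proxy pattern.
rng = np.random.default_rng(7)
protected = rng.choice(["X", "Y"], size=1000)
zip_band = np.where(protected == "X",
                    rng.choice(["north", "south"], size=1000, p=[0.8, 0.2]),
                    rng.choice(["north", "south"], size=1000, p=[0.2, 0.8]))

contingency = pd.crosstab(pd.Series(zip_band), pd.Series(protected))
chi2, _, _, _ = chi2_contingency(contingency)
n = contingency.to_numpy().sum()
k = min(contingency.shape) - 1          # min(rows, cols) - 1
cramers_v = float(np.sqrt(chi2 / (n * k)))
print(f"Cramér's V = {cramers_v:.3f}")
```

With an 80/20 vs. 20/80 split, Cramér's V comes out far above a 0.15 threshold, so this feature-attribute pair would be flagged for documentation and monitoring.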
## Writing Model Cards
A model card tells anyone who inherits your deployed model what it does, how it was built, what it should and shouldn't be used for, and what its known failure modes are. Federal programs are increasingly requiring model cards under various names: AI system documentation, algorithm impact assessment, responsible AI documentation.
A minimal federal model card covers nine items:
- Model description — What does it predict? What is the output format?
- Intended use — Which populations, contexts, and decisions is it approved for?
- Out-of-scope uses — What should it NOT be used for? As important as intended use.
- Training data — Source, time period, geographic scope, known gaps, data quality tier.
- Evaluation data — How does the test set differ from training? Is it a temporal hold-out?
- Performance metrics — Overall AND stratified by the operationally relevant subgroups.
- Bias and fairness analysis — Results of demographic parity and equalized odds checks with numbers.
- Limitations and risks — What conditions cause the model to underperform?
- Monitoring plan — Who is responsible? How often is it reviewed? What triggers a retrain?
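One way to keep the nine items from drifting out of sync with the deployed model is to store the card as a structured record and render it programmatically, failing loudly when a section is empty. This is a sketch: the markdown output format and the `render_model_card` helper are assumptions, not a mandated federal template.

```python
# Section names follow the nine-item list above.
MODEL_CARD_SECTIONS = [
    "Model description", "Intended use", "Out-of-scope uses",
    "Training data", "Evaluation data", "Performance metrics",
    "Bias and fairness analysis", "Limitations and risks", "Monitoring plan",
]

def render_model_card(card: dict) -> str:
    """Render the nine-section card to markdown; raise if any section is empty."""
    missing = [s for s in MODEL_CARD_SECTIONS if not card.get(s)]
    if missing:
        raise ValueError(f"Model card incomplete, missing: {missing}")
    lines = ["# Model Card"]
    for section in MODEL_CARD_SECTIONS:
        lines.append(f"\n## {section}\n{card[section]}")
    return "\n".join(lines)
```

Rendering from a dict means the "all nine sections present" check on the pre-deployment checklist becomes a one-line assertion rather than a manual review.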
## Data Governance on Platforms
### Unity Catalog on Databricks
Unity Catalog provides the technical infrastructure for compliance: column-level security (PII fields masked), row-level security (enforced at query time), data lineage (which tables produced which outputs), and audit logs (every access logged to queryable tables). Use Unity Catalog lineage APIs to document training data provenance in your model card.
```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.catalog import LineageDirection

w = WorkspaceClient()

def get_table_lineage(table_full_name: str) -> dict:
    """
    Retrieve upstream lineage for a Delta table using Unity Catalog API.
    Use this to document training data provenance in a model card or RAI assessment.
    """
    lineage = w.lineage_tracking.table_lineage(
        table_name=table_full_name,
        direction=LineageDirection.UPSTREAM,
    )
    upstream_tables = []
    written_by = []
    for node in lineage.upstreams or []:
        if hasattr(node, "table_info") and node.table_info:
            upstream_tables.append(node.table_info.full_name)
        if hasattr(node, "notebook_info") and node.notebook_info:
            written_by.append({"type": "notebook", "path": node.notebook_info.path})
    result = {"table": table_full_name, "upstream_tables": upstream_tables,
              "written_by": written_by}
    print(f"Lineage for {table_full_name}:")
    for t in upstream_tables:
        print(f"  Upstream: {t}")
    return result

# Document training data provenance for model card
lineage = get_table_lineage("jupiter_catalog.silver.maintenance_features")
```
## Where This Goes Wrong
### Failure Mode 1: Ethics as a Checklist at the End
Bias testing scheduled for "Sprint 6" while the model is already being shown to stakeholders in Sprint 4. Fix: run the proxy correlation scan before feature selection. Run the preliminary bias audit before model selection. It takes four hours to do a basic bias audit. Do it while the model is still malleable, not after it has been briefed to the program manager as "done."
### Failure Mode 2: Documenting Intent Instead of Results
The model card says "bias was considered" without quantitative results. The RAI documentation uses words like "efforts were made." Fix: the number is the document. "The model's FPR for Group A is 12.3% vs. 8.1% for Group B — a ratio of 1.52. This was reviewed during RAI assessment and determined within acceptable range because [specific reason]. It will be monitored monthly." That is a real document.
### Failure Mode 3: One-Time Compliance vs. Ongoing Governance
The RAI assessment was approved 18 months ago. The model hasn't changed. Therefore it is still approved. The monitoring job hasn't run in 90 days, training data is 24 months old, and the operational context has changed. Fix: set a calendar reminder for 12 months after approval. Re-run bias metrics, check monitoring data, determine whether re-review is required. The RAI approval is for the model at a specific point in time on specific data in a specific operational context.
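That annual re-check can be partially automated by comparing fresh metrics against the thresholds recorded at approval time. A sketch, with illustrative metric names, dates, and thresholds — the `governance_recheck` helper is an assumption, not a mandated process:

```python
from datetime import date

# Thresholds recorded at RAI approval time (illustrative values).
APPROVAL = {
    "approved_on": date(2024, 1, 15),
    "thresholds": {"auc_min": 0.80, "dp_ratio_min": 0.80},
    "max_training_data_age_months": 18,
}

def governance_recheck(current: dict, today: date) -> list[str]:
    """Return the reasons a re-review is required (empty list = still within approval)."""
    reasons = []
    if current["auc"] < APPROVAL["thresholds"]["auc_min"]:
        reasons.append(f"AUC {current['auc']} below {APPROVAL['thresholds']['auc_min']}")
    if current["dp_ratio"] < APPROVAL["thresholds"]["dp_ratio_min"]:
        reasons.append(f"dp_ratio {current['dp_ratio']} below threshold")
    months = ((today.year - current["training_data_date"].year) * 12
              + (today.month - current["training_data_date"].month))
    if months > APPROVAL["max_training_data_age_months"]:
        reasons.append(f"training data {months} months old")
    return reasons
```

A scheduled job that runs this check and files a ticket when the list is non-empty turns "set a calendar reminder" into something that does not depend on one person's calendar.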
## Pre-Deployment Ethics Checklist
- Training data source documented with table names, time period, and row count
- Data lineage captured in Unity Catalog or Foundry catalog
- PII columns masked or excluded from training features
- Protected characteristics relevant to the use case identified
- Proxy correlation scan run against all training features
- Demographic parity ratio computed for all protected groups
- Equalized odds check (TPR and FPR parity) completed
- Bias findings documented with quantitative results (not intent statements)
- Model card completed with all nine required sections
- Monitoring owner is a named person, not a team or office
- Retraining trigger is a specific metric threshold, not "as needed"
- NIST AI RMF MAP, MEASURE, and MANAGE steps documented
## Exercises
This chapter includes 5 hands-on exercises with full solutions — coding challenges, analysis tasks, and scenario-based problems.
View Exercises on GitHub →