Derek Linz

Sr. Staff Escalation & Reliability Engineer

€1,200/day
Rotterdam, NL
8-15 years

Average response time: 1 hour

About Derek

I specialize in the hard problems — the ones that survive first-line support and require deep investigation across the hardware/software boundary. With 10+ years in datacenter-scale environments, I've spent my career at Nutanix diagnosing critical failures where compute, storage, networking, and hypervisor layers intersect.
My focus areas include GPU-accelerated infrastructure (NVIDIA driver debugging, VFIO/mdev passthrough, ECC analysis), Linux internals (kernel crash analysis, kdump, driver fault tracing), KVM/AHV virtualization, and NUMA/performance regression work on clustered systems. I've built diagnostic tooling used thousands of times across global customer fleets and worked directly with NVIDIA engineering to trace and resolve driver-level regressions.
I'm available for short or long-term engagements involving escalation support, infrastructure reliability investigations, diagnostic tooling, or performance analysis on complex Linux/GPU environments.
Certifications: RHCSA · Nutanix Certified Master (NCM MCI) · VCP6-DCV/NV
  • English

    Native or bilingual

Can work on-site
Rotterdam (up to 50km)

Experience

  • Nutanix
    Sr. Staff Escalation & Systems Reliability Engineer
    TECH
    January 2019 - July 2025 (6 years and 6 months)
    Amsterdam, Netherlands
    • Performed kernel-level crash-dump analysis on production clusters, isolating failures within NVIDIA's closed-source GPU driver modules; identified a driver regression that triggered firmware-level timeout conditions under specific workloads, and collaborated with Nutanix GPU Engineering and NVIDIA to validate fixes using pre-release driver builds.
    • Troubleshot GPU passthrough (VFIO) and mediated-device (mdev) issues on AHV/KVM, including driver-binding problems, incomplete GPU reset behavior (FLR-related), and mdev provisioning failures due to host/guest driver mismatches.
    • Designed and maintained automated live-boot ISO images embedding diagnostic payloads for GPU and storage nodes; scripts captured telemetry, logs, and performance signatures automatically on boot, reducing triage time from hours to minutes across global customer fleets.
    • Developed SQL correlation queries on large telemetry datasets to detect configuration-dependent failure signatures across customer environments; integrated results into automated workflows that surfaced high-impact issues for engineering and support.
    • Investigated NUMA locality challenges in Nutanix AHV clusters where Controller VMs are pinned to host cores; validated BIOS configurations to maximize local I/O performance and avoid remote-node latency penalties.
    • Ran performance benchmarks with Cinebench and the Phoronix Test Suite on Nutanix hardware platforms, comparing guest performance across hypervisors (AHV/ESXi), kernel versions, and CPU architectures to identify configuration-dependent regressions.
    • Created Python and Bash automation to orchestrate log capture, correlate kernel events, and produce actionable diagnostic summaries for critical customer escalations.
    VMware KVM Linux Data visualization Python
  • Nutanix
    Systems Reliability Engineer
    TECH
    January 2016 - January 2019 (3 years)
    Amsterdam, Netherlands
    • Troubleshot cross-layer failures across compute, storage, and networking paths in distributed Nutanix clusters.
    • Provisioned and managed server hardware in the Support Lab; performed node bring-up, imaging, racking, and platform validation.
    • Supported engineering teams by validating experimental configurations and identifying systemic reliability issues.
    • Work directly supported production infrastructure operating at global datacenter scale.
    System administration VMware Linux KVM Nutanix
  • VCE
    vPlatform Support Engineer
    TECH
    June 2015 - March 2016 (9 months)
    Durham, United States
    Provided escalation support for VCE Vblock converged infrastructure, resolving complex customer issues across compute, networking, and storage layers. Authored and presented RCA documents for critical incidents, proactively identified at-risk customers based on problem trends, and mentored junior support engineers. Worked closely with critical accounts and emerging product lines.
    VMware Cisco

Education

  • NVIDIA AI Enterprise Admin
  • RHCSA