Lead Mainframe Infrastructure Engineer
Job Description
As a Lead Infrastructure Engineer at JPMorganChase within Corporate Sector – Enterprise Technology, you are part of an agile team that works to enhance, design, and deliver the software components of the firm’s state-of-the-art technology products in a secure, stable, and scalable way. You will help engineer and run IBM MQ on z/OS within the IBM Z ecosystem, drive automation, and advance AIOps/operational analytics to improve stability, resiliency, and recovery. We value an AI mindset and curiosity: bringing an automation-first approach, continuously learning, and applying AI responsibly to reduce toil and improve outcomes.
Job Responsibilities
Provide global on-call support as part of a shared rotation; lead structured incident/problem resolution and continuous improvements.
Design, develop, and deploy changes while following firm processes for change management, issue resolution, and design governance, with work tracked in JIRA.
Engineer, secure, and optimize IBM MQ on z/OS (queue managers, queues, channels, clusters, QSG, logging, HA across LPARs/sysplex); assess upstream/downstream impacts and mitigation actions.
Provide MQ administration support to multiple applications; enable load balancing/failover via clustering and queue sharing groups.
Build and integrate telemetry from MQ and platform sources (e.g., queue depth, put/get rates, channel states, DLQ events, plus SMF/RMF and logs) into enterprise observability/SRE workflows.
Deliver AIOps and automation: anomaly detection, capacity forecasting, intelligent alerting/noise reduction, and codified runbooks for repeatable operations and remediation.
Demonstrate an AI mindset and curiosity by identifying opportunities to improve operational workflows, modernize runbooks, and safely experiment with automation/analytics to prevent incidents and reduce MTTR.
Use modern IBM Z capabilities where appropriate: z/OSMF workflows & REST APIs, monitoring (e.g., MAINVIEW or equivalent), Common Data Provider/SMF streaming, z/OS Connect EE, and USS-based (Python/REXX) automation.
Support cryptography inventory and PQC readiness enhancements for MQ/z/OS flows (TLS configurations, cipher/key usage, dependencies, key rotation/exception tracking) and integrate measurable coverage into operational reporting.
Produce runbooks, architecture documentation, and audit-ready evidence; partner with cybersecurity, risk/control, SRE, and platform owners; align with Responsible AI and model risk expectations where ML is used.
Required Qualifications, Capabilities, and Skills
Formal training/certification in Infrastructure Engineering concepts and 5+ years of applied experience (or equivalent).
Strong experience on IBM zSeries / IBM Z with z/OS fundamentals (e.g., sysplex concepts, WLM, JES2/3, USS, SAF/RACF, SMF dataset management).
Understanding of MQ concepts: queue managers, queues, channels, and MQ object administration.
Understanding of RACF/ACF2 security, least privilege, and secure change practices.
Strong critical thinking, problem-solving, and communication skills; ability to collaborate across roles and teams.
Demonstrated curiosity, continuous learning, and comfort working across infrastructure, automation, and analytics domains.
Preferred Qualifications, Capabilities, and Skills
MQ systems programming depth: installation/maintenance/implementation on z/OS, clusters/QSG, tuning using SMF records and monitoring data.
Data/AI engineering for ops telemetry: Python, time-series/log shaping, schema governance; streaming/batch pipelines using firm-approved platforms (e.g., Kafka, Spark, enterprise observability).
Automation: REXX, CLIST, JCL, Ansible, z/OSMF/Zowe workflows; CI/CD and controlled release in regulated environments.
Familiarity with CICS, DB2, IMS, and cross-platform MQ integrations.
Experience productionizing ML (lifecycle, monitoring/drift, explainability, rollback) and applying Responsible AI principles.
IBM MQ on z/OS experience preferred, but not mandatory if you bring strong adjacent experience (IBM Systems, messaging, observability/AIOps, automation) and willingness to learn MQ rapidly.


