WORKLOAD AUTOMATION COMMUNITY

Test your Workload Automation detective skills: Whodunnit?

1/24/2019

What if an ordinary day with Workload Automation turned into an intriguing murder-mystery game? How would you find out what went wrong? What was the weapon, and in which room did the crime take place? Let’s put on the detective hat and start the game!

Everything was going as scheduled: all the workstations in your environment were up and running, the entire workload was running happily and smoothly, and all the deadlines were going to be met. You were just starting to enjoy your coffee while looking at the greenest dashboard you had ever seen… And then, it happened. AGAIN. A job on your dashboard appears to be at potential risk!
Stay cool! The clock is ticking but you need to keep calm and focused to solve this mystery crime case and save the entire workload (and your day). 
 
Let’s start looking for clues… With a slightly shaking hand, you click on the “potential risk” bar to step into the crime scene.

Here it is: your most important job is at risk… but who is responsible for that? 

​
You launch the what-if-analysis and highlight the “critical path” to find out who is impacting your precious deadline. 

And then you find it! Here is Mister Job, at the beginning of the chain, blocking your entire workload… 
Now it is time to hold back your tears and start investigating with the right questions. 
Who is blocking your job? Who is the culprit? 
 
Agent down 
We know, the first to blame is always the butler. So your first click goes to the workstation.
Same old story: you are expecting to find your agent not running (or your agent without the J flag, if you are checking it with the conman sc command).
This happens when the server sets the agent down because it has received no heartbeat from your agent. You need to check for a message like the following one in the WebSphere Application Server log room: 
WebSphere Application Server log room

    
Note that those messages can be found in SystemOut.log in the WebSphere Application Server log path (/opt/IBM/TWA/WAS/TWSProfile/logs/server1) for releases earlier than 9.5, or in the messages.log file in the Liberty log path (/opt/wa/server_root/TWSDATA/stdlist/appserver/engineServer/logs) starting from 9.5.
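If you would rather let a script do the legwork in the log room, the search can be sketched roughly as follows. This is only an illustration: the function names `log_dir_for_release` and `find_lines` are hypothetical, the directories are the default ones quoted above written as absolute paths (an assumption, your installation may differ), and the search string is whatever message text you are hunting for.

```python
from __future__ import annotations

from pathlib import Path

# Default log directories quoted in the text, assumed to be absolute paths.
# Adjust them if your installation uses a different location.
WAS_LOG_DIR = Path("/opt/IBM/TWA/WAS/TWSProfile/logs/server1")  # releases before 9.5
LIBERTY_LOG_DIR = Path(
    "/opt/wa/server_root/TWSDATA/stdlist/appserver/engineServer/logs"
)  # 9.5 and later


def log_dir_for_release(release: str) -> Path:
    """Pick the application server log directory for a release like '9.4' or '9.5.0.1'."""
    parts = tuple(int(p) for p in release.split("."))
    return LIBERTY_LOG_DIR if parts >= (9, 5) else WAS_LOG_DIR


def find_lines(log_dir: Path, needle: str) -> list[tuple[str, int, str]]:
    """Return (file name, line number, line) for each line in *.log files containing needle."""
    hits = []
    for log_file in sorted(log_dir.glob("*.log")):
        # errors="replace" keeps the scan going if a log contains odd bytes.
        with log_file.open(errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if needle in line:
                    hits.append((log_file.name, number, line.rstrip("\n")))
    return hits
```

For example, `find_lines(log_dir_for_release("9.5"), "some message text")` would list every matching line in the Liberty log directory, with the file it came from.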
 
If this happens, a StartUpLwa command on your agent will be enough to restore it. But… this is not your case. Your agent is running… So why is your job stuck?
Job in INTRO 
This time you try Monitor jobs (or conman sj, if you prefer the command line). Surely this way you will understand what’s going on… And in fact, there is Mister Job, lying in INTRO status without going READY.
 
You know, now you need more clues… but where to find them?  

You decide to proceed with a systematic approach, to avoid getting lost forever in the twists and turns of the Workload Automation Mansion.
 
The first room you enter is the WebSphere Application Server log room, to find out whether the broker and the agent are talking to each other.
 
Mr. Broker has already sent your Mr. Job to Mr. Agent… but no response has been received from Mr. Agent. Is he responding or not? Who is blocking your messages and trying to mislead your investigation? 
 
You will need to move to the agent log room to find these answers! 

You open stdlist/JM/jobmessage.log and suddenly find out that the agent is unable to send resources to the broker:
Agent

    
This is the evidence you were looking for: now you can accuse Mr. Agent! Otherwise you would have found a message like the following one:
Agent

    
Still no signs of the crime scene weapons. 
You write down a quick list of all the possible weapons that could be preventing Mr. Agent from communicating with Mr. Broker.

Is the DNS preventing your agent from resolving the broker address? If yes, you need to check and fix the hosts file.

Is the firewall blocking your messages? If yes, you need to ask your network admin to open the port.

Is the agent trying to reach the broker on a port different from the default one (31116)? The jobmanager.ini configuration needs to be fixed in this case. 
​

Is the CIT just sending resource status updates while the server is waiting for the full resource status? In this case, restarting the agent will solve the case and save your day! 
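The first three suspects on the list (DNS, firewall, wrong port) can be interrogated from the agent machine with a few lines of Python. The snippet below is a minimal sketch and not a Workload Automation tool: the function names are hypothetical, the broker host is whatever you pass in, and the only value taken from the text is the default port 31116. It tells you whether the name resolves and whether a TCP connection can be opened; it cannot tell a firewall block apart from a broker that is simply not listening.

```python
import socket

DEFAULT_BROKER_PORT = 31116  # default broker port mentioned in the text


def resolve_host(host: str) -> list:
    """Return the addresses the local resolver gives for host, or [] if resolution fails."""
    try:
        infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return []
    # Each entry ends with a sockaddr tuple whose first element is the address.
    return sorted({info[4][0] for info in infos})


def can_connect(host: str, port: int = DEFAULT_BROKER_PORT, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def diagnose(host: str, port: int = DEFAULT_BROKER_PORT, timeout: float = 3.0) -> str:
    """Return 'dns' if the name does not resolve, 'unreachable' if the port is closed, else 'ok'."""
    if not resolve_host(host):
        return "dns"
    if not can_connect(host, port, timeout):
        return "unreachable"
    return "ok"
```

Run `diagnose("your.broker.host")` from the agent machine: a result of `"dns"` points at the hosts file, while `"unreachable"` points at the firewall or at the port configured for the agent.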
 
Job in READY 
Let’s picture another case with a completely different investigation scenario. Going back to Monitor jobs, what do you need to look at if Mr. Job is READY but not starting? Multiple weapons are possible in this case! Usually, it’s a matter of dependencies: is Mr. Job waiting for a prompt, a resource, or a time dependency? Or is there a job or job stream dependency preventing Mr. Job from starting? But these are not the only options: it could be a limit or fence fault! In those cases, going to the Job Stream View and asking “why a job does not start?” can help your investigation.
Job in FAIL 
What should you check instead if Mr. Job went straight to FAIL status right after being in EXEC status? In this case the weapon is easy to spot: the job failed to start because of the wrong user. You need to go to the Workload Designer room to fix the job definition!
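For detectives who like a pocket notebook, the three investigation routes above can be condensed into a small lookup table. This is just a summary sketch of the clues described in this post; the names `CHECKS_BY_STATUS` and `next_checks` are made up for illustration and are not part of any Workload Automation API.

```python
# Clues to follow for each job status, as described in this post.
CHECKS_BY_STATUS = {
    "INTRO": [
        "Check the application server log: did the broker get a response from the agent?",
        "Check the agent log: is the agent able to send resources to the broker?",
        "Check DNS resolution of the broker address (hosts file)",
        "Check that the firewall allows traffic on the broker port",
        "Check the broker port configured for the agent (default 31116)",
        "Restart the agent if only resource status updates are being sent",
    ],
    "READY": [
        "Check prompt, resource, and time dependencies",
        "Check job and job stream dependencies",
        "Check limit and fence",
    ],
    "FAIL": [
        "Check the user in the job definition",
    ],
}


def next_checks(status: str) -> list:
    """Return the list of clues for a job status, or [] for a status not covered here."""
    return CHECKS_BY_STATUS.get(status.strip().upper(), [])
```

Calling `next_checks("ready")` returns the three dependency, limit, and fence checks, while a status that this post does not cover returns an empty list.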
 
The final accusation 
Regardless of the route you have followed so far, you can be proud of yourself: you MADE IT!
Sherlock Holmes would be proud of you. You stayed calm and saved Workload Automation by investigating the crime. Like all respectable superheroes, you moved unnoticed until you found the perfect match between your hypothesis and the available data. You can relax a little bit… but be ready to play another mystery game soon!
Alessandro Tomasi 
Senior Technical Lead 
Connect with me on LinkedIn
 
Alessandro is a Software Engineer. He has an academic background in Computer Engineering, in particular in technology customer support and automation testing. Alessandro has authored several patents and publications. He loves tech and sports, and was a member of the RoboCup during his schooling. He loves snowboarding and has completed several Spartan races. Alessandro is currently based at the HCL Products and Platforms software development laboratory in Rome.
 
Michela Fortuna 
UX Designer 
Connect with me on LinkedIn

Michela Fortuna is a UX Designer in the Workload Automation design team. She works on improving and optimising user experience, as well as on designing interfaces and interactive prototypes. She’s passionate about technology, visual design, and innovation. Her academic studies include a BA in Communication and an MS in Experience Design.
 
 
Enrica Pesare  
UX Designer  
Connect with me on LinkedIn

Enrica is a UX Designer for the HCL Products and Platforms software development laboratory located in Rome, Italy. She is a Software Engineer in Quality Assurance for Workload Automation and is also the owner of the Workload Automation production environment. Currently, Enrica is part of the UX design team for Workload Automation.

www.hcltechsw.com
About HCL Software 
HCL Software is a division of HCL Technologies (HCL) that operates its primary software business. It develops, markets, sells, and supports over 20 product families in the areas of DevSecOps, Automation, Digital Solutions, Data Management, Marketing and Commerce, and Mainframes. HCL Software has offices and labs around the world to serve thousands of customers. Its mission is to drive ultimate customer success with their IT investments through relentless innovation of its products. For more information, please visit www.hcltechsw.com. Copyright © 2019 HCL Technologies Limited