sixtydoses. where od is harmless.

June 18, 2010

I’m a Slacker.

Filed under: Tech — Tags: , , — od @ 6:10 am

No, no.. not that slacker. But Slacker.

I’ve been using Slackware for about a month or so. I haven’t really had the chance to explore it in depth yet, but so far I’m loving it. The installation process was quick and a breeze (it reminded me of sysinstall). I actually reinstalled three times, due to my lack of skill in partition sizing. The notebook that my company provided only has an 80 GB hard disk, so I had to be very stingy when allocating space.

When I first wanted to switch from Ubuntu, which I had been using for about 3 years, I was torn between Arch and Slack. I decided to go with Slack because of some success stories I’d read from users installing Lotus Notes. Well, to be honest there are similar success stories of users installing Lotus Notes on Arch, but.. I must say that I’m pretty heavily influenced by other FreeBSD forum users.

Installing Lotus Notes was a bit tricky, but once I got it installed, it ran pretty well. It crashed a few times, rarely though, but hey, this is Lotus Notes I’m talking about here. It could crash on any platform you could think of. Lol. I seriously hate Lotus Notes. Argh. It’s huge, bloated and cripplingly slow. I wish IBM would produce a lightweight version of Lotus Notes as an option for those who have to ‘painfully’ use it just to check email, and not to mark calendars, browse the net, play with widgets and whatnot. Gah. Apart from Lotus Notes, I’ve installed WebLogic and it runs smoothly on Slackware, no surprise there. I haven’t tried WebSphere though. Hopefully it’ll run fine as well. I definitely need one running on my notebook, or else don’t call me a WebSphere technical support person.

There’s one thing that irks me though.. the netconfig command. It is the command you use to set an IP address on your machine, but I can’t for the life of me understand why it has to ask me to key in the hostname every time I want to change my IP. How many times do I change my IP address? Working in technical support, a lot. How many times do I change the hostname of my machine? I don’t know. Like.. never? Well, I know I’m complaining here, so.. am just gonna STFU and modify the netconfig script to stop prompting me for a hostname and FQDN. I’ve also added to the script the 2 locations that I frequent the most, my office and home. I know I could just run a one-liner to change my IP, but that change would be lost on reboot, and I prefer the IP to persist.
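
For anyone curious what the persistent route looks like: as far as I know, Slackware keeps the static settings for the first interface in /etc/rc.d/rc.inet1.conf (IPADDR[0], NETMASK[0], GATEWAY), which is the file netconfig writes. Below is a minimal sketch of a location switcher that edits only those three lines. It is not the modified netconfig script from this post; the function name set_location and every address in it are made-up placeholders.

```shell
# Sketch only: rewrite the first-interface entries of a Slackware-style
# rc.inet1.conf for a named location. The location names and all addresses
# below are placeholders, not values from the post.
set_location() {
    conf=$1      # path to rc.inet1.conf (or a copy of it)
    location=$2  # "office" or "home"

    case $location in
        office) ip=10.0.0.50    mask=255.255.255.0 gw=10.0.0.1 ;;
        home)   ip=192.168.1.50 mask=255.255.255.0 gw=192.168.1.1 ;;
        *) echo "unknown location: $location" >&2; return 1 ;;
    esac

    tmp=$(mktemp) || return 1
    # Replace only the three lines we care about; everything else is kept.
    sed -e "s|^IPADDR\[0\]=.*|IPADDR[0]=\"$ip\"|" \
        -e "s|^NETMASK\[0\]=.*|NETMASK[0]=\"$mask\"|" \
        -e "s|^GATEWAY=.*|GATEWAY=\"$gw\"|" \
        "$conf" > "$tmp" && cat "$tmp" > "$conf"
    status=$?
    rm -f "$tmp"
    return $status
}
```

Something like set_location /etc/rc.d/rc.inet1.conf home followed by /etc/rc.d/rc.inet1 restart (as root) should then apply it. The sketch leaves USE_DHCP[0] alone, so I'd expect that entry to need to be empty for the static address to take effect.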

Another thing that I still haven’t got a grip on is the installation of 3rd party packages. All of the packages on my Slackware were installed by manually fetching them from the slackbuild website and installing them with installpkg. I came across a guy saying this in a forum.. “True slackers compile from the source”. Meh. So obviously I’m not a true slacker.. But anyhoo, that’s not the point. So apart from using installpkg and removepkg, or compiling a 3rd party app from source, I could also opt for slackpkg, the automated tool for managing Slackware packages. Hrmm. But I can only specify one mirror.. at first I thought it was like apt-get, where you can define repositories and just run apt-get install <package> and be done with it. But apparently there are not many packages that I can install using slackpkg. Most of the time when I run slackpkg search <package>, I get a message stating that there is no match for the package that I wish to install. I really don’t mind compiling, or installing the binaries, but I hate having to go to websites and fetch the packages manually. I think I must’ve missed something; there must be a better way that true slackers practice that I have yet to discover.

With all that said, I love the current setup of my notebook with Slackware. Everything is snappy.

So Ray, let me hear you roar.

March 5, 2010

Either Oracle Smart Update utility interface sucks or I’m an artard.

Filed under: Tech — Tags: , , — od @ 6:43 pm

I just couldn’t figure out how to get Smart Update to work offline. I don’t have the authorization to download all the patches, so I got them from some Oracle support guy, fired up bsu.sh, selected work offline, and spent several hours trying to figure out how to point it to my patches directory. Bleargh. Fortunately, the CLI works like a charm.


od@sysh:/opt/bea103/utils/bsu$ ./bsu.sh -install -patchlist=79YU,CJ4W,ETR7,IQXV,SYCB,T552 -patch_download_dir=/home/od/Desktop/allpatcheszip -prod_dir=/opt/bea103/wlserver_10.3
Checking for conflicts..
No conflict(s) detected

Installing Patch ID: 79YU..
Result: Success

Installing Patch ID: CJ4W.
Result: Success

Installing Patch ID: ETR7.
Result: Success

Installing Patch ID: IQXV.
Result: Success

Installing Patch ID: SYCB.
Result: Success

Installing Patch ID: T552.
Result: Success



On a different note, while I was installing WebLogic Server 10.3.x in silent mode, I was prompted with this error:


od@sysh:~/Desktop$ ./server103_linux32.bin -mode=silent -silent_xml=silent_103.xml
Extracting 0%……………………………………………………………………………………….100%
The local BEA product registry is corrupted. Please select another BEA Home or contact BEA Support
** Error during execution, error code = 65280.



Found a forum post that says that, for versions above 9.x, ‘COMPONENT_PATHS’ doesn’t accept a value like “WebLogic Server/Core Application Server” anymore. But actually it does accept the “WebLogic Server/<insert_component>” format, and even the documentation says so.

So anyway, the error was due to my habit of copying and pasting in vi, which sort of corrupted my component paths line. So yea, if you get that kind of error, I’d say chances are your silent.xml file is malformed.

These are my silent.xml files.



This will install all WebLogic Server components.

###################################################

<?xml version="1.0" encoding="UTF-8"?>
<!-- Silent installer option: -mode=silent -silent_xml=C:\bea\silent.xml -->

<bea-installer>
<input-fields>
<data-value name="BEAHOME" value="/opt/bea1032" />
<data-value name="WLS_INSTALL_DIR" value="/opt/bea1032/wlserver_10.3" />
<data-value name="COMPONENT_PATHS" value="WebLogic Server" />
<data-value name="INSTALL_NODE_MANAGER_SERVICE" value="yes" />
<data-value name="NODEMGR_PORT" value="5559" />
<data-value name="INSTALL_SHORTCUT_IN_ALL_USERS_FOLDER" value="yes"/>

</input-fields>
</bea-installer>

###################################################



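This one lists the components explicitly instead (with Windows-style paths). To leave out the samples, drop the WebLogic Server/Server Examples entry from COMPONENT_PATHS.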

###################################################

<?xml version="1.0" encoding="UTF-8"?>
<!-- Silent installer option: -mode=silent -silent_xml=C:\bea\silent.xml -->

<bea-installer>
<input-fields>
<data-value name="BEAHOME" value="D:\bea\wls103_silent" />
<data-value name="WLS_INSTALL_DIR" value="D:\bea\wls103_silent\wlserver_10.3" />
<data-value name="WLW_INSTALL_DIR" value="D:\bea\wls103_silent\workshop_10.3" />
<data-value name="COMPONENT_PATHS" value="WebLogic Server/Core Application Server|WebLogic Server/Administration Console|WebLogic Server/Configuration Wizard and Upgrade Framework|WebLogic Server/Web 2.0 HTTP Pub-Sub Server|WebLogic Server/WebLogic JDBC Drivers|WebLogic Server/Third Party JDBC Drivers|WebLogic Server/WebLogic Server Clients|WebLogic Server/WebLogic Web Server Plugins|WebLogic Server/UDDI and Xquery Support|WebLogic Server/Server Examples|Workshop/Workshop for WebLogic|Workshop/Workshop Runtime Framework" />
<data-value name="USE_EXTERNAL_ECLIPSE" value="false" />
<data-value name="EXTERNAL_ECLIPSE_DIR" value="D:\eclipse332\eclipse" />
<data-value name="INSTALL_NODE_MANAGER_SERVICE" value="yes" />
<data-value name="NODEMGR_PORT" value="5559" />
<data-value name="INSTALL_SHORTCUT_IN_ALL_USERS_FOLDER" value="yes"/>

</input-fields>
</bea-installer>

###################################################


What caused the error was that I simply copied the component paths from my terminal and pasted them into vi, so I kinda missed the fact that the line was no longer continuous: the paste introduced breaks of white space. Something like this:

<data-value name="COMPONENT_PATHS" value="WebLogic Server/Core Application Server|WebLogic Server/Administration Console|WebLogic Server/Configuration Wizard and Upgrade Framework|WebLogic Server/Web 2.0 <break of white space>
HTTP Pub-Sub Server WebLogic Server/WebLogic JDBC Drivers|WebLogic Server/Third Party JDBC <break of white space>
Drivers|WebLogic Server/WebLogic Server Clients|WebLogic Server/WebLogic Web Server Plugins|WebLogic Server/UDDI and Xquery Support|WebLogic Server/Server Examples|Workshop/Workshop for <break of white space>
WebLogic|Workshop/Workshop Runtime Framework" />
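
A quick way to catch this before running the installer is to check that every data-value element opens and closes on the same line. This is only a rough sketch under that one assumption (one element per line, as in the files above); check_silent_xml is a made-up helper name, not something that ships with WebLogic, and it does not validate anything else about the file.

```shell
# Sketch: flag any <data-value ...> element that is not closed on the line
# where it starts, which is what a bad paste in vi tends to produce.
check_silent_xml() {
    awk '
        /<data-value/ && !/\/>/ {
            printf "line %d: data-value is split across lines\n", NR
            bad = 1
        }
        END { exit (bad ? 1 : 0) }
    ' "$1"
}
```

Running check_silent_xml silent_103.xml prints the offending line numbers and returns non-zero if it finds any, so it can sit in front of the installer command in a script.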

January 22, 2010

Autostart tomcat upon reboot.

Filed under: Tech — Tags: , , , , — od @ 1:04 pm

So this morning they shut down the server and then called me up complaining that the website was down.

No, I didn’t ask how many times they had rebooted. Lol.

Anyway, the website was down because I hadn’t configured either apache or tomcat to start automatically on boot. Am so lazy today because it’s Friday; basically it’s a yippee day, a day when it’s legal to come to work late and go home early.

I googled for an autostart script, but none satisfied my needs, so here’s mine (adapted from a couple of scripts), because sharing is caring.

This script will always run tomcat as user ‘admin’ (EUID 500). If you run the script as a different non-root user, it’ll prompt for admin’s password; root, which is what runs it at boot, is not prompted. Dump the script in /etc/init.d/ and run chkconfig to configure runlevel startup.





#!/bin/bash
#
# tomcat     This is the init.d script used to start tomcat.
#            It calls $CATALINA_HOME/bin/startup.sh or shutdown.sh
# chkconfig: - 91 15
# description: Apache Tomcat is an open source software implementation of the Java Servlet and JavaServer Pages technologies.
# processname: tomcat

export JAVA_HOME=/usr/java/jdk1.6.0_16
export CATALINA_HOME=/usr/local/apache-tomcat-5.5.28

tomcat_stop() {
    if [[ $EUID -ne 500 ]]; then
        # Not running as admin: switch user first.
        su -c "$CATALINA_HOME/bin/shutdown.sh" admin
    else
        "$CATALINA_HOME/bin/shutdown.sh"
    fi
}

tomcat_start() {
    if [[ $EUID -ne 500 ]]; then
        # Not running as admin: switch user first.
        su -c "$CATALINA_HOME/bin/startup.sh" admin
    else
        "$CATALINA_HOME/bin/startup.sh"
    fi
}

case $1 in
start)
    echo -n "Starting Tomcat server:"
    tomcat_start
    echo "."
    ;;
stop)
    echo -n "Stopping Tomcat server:"
    tomcat_stop
    echo "."
    ;;
*)
    echo "Usage: /etc/init.d/tomcat start|stop"
    exit 1
    ;;
esac
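
About that "# chkconfig:" comment near the top of the script: it is the line chkconfig reads, and its three fields are the default runlevels ("-" meaning none), the start priority and the stop priority. Here is a small sketch that prints those fields for any init script, just to make the header less magical; show_chkconfig_header is a made-up helper name and it only handles the "# chkconfig: a b c" layout used above.

```shell
# Sketch: print what the "# chkconfig:" header of an init script asks for.
# Fields after the tag: default runlevels ("-" = none), start priority,
# stop priority. Returns non-zero if the script has no such header.
show_chkconfig_header() {
    awk '
        /^#[ \t]+chkconfig:/ {
            printf "runlevels=%s start=%s stop=%s\n", $3, $4, $5
            found = 1
            exit
        }
        END { if (!found) exit 1 }
    ' "$1"
}
```

With "-" as the runlevel field, registering the script with chkconfig --add tomcat should not enable it anywhere by itself; as far as I know it takes a chkconfig tomcat on afterwards to switch it on for the usual runlevels.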

December 31, 2009

It’s the last day of 2009 and I just want to write – new year, new theme.

Filed under: Tech — Tags: , , — od @ 1:58 pm

Remember I ranted about how difficult it is to find a wallpaper without having a specific theme in mind?

Well, it turned out I found something better than just a wallpaper: a new theme for my xfce. I’ve been switching between the xfce default theme and the Clearlooks theme for a couple of years, simply because it was so hard for me to find a clean theme with the right color, the right tone and the right style.

I just found this awesome theme a couple of days ago, and I couldn’t be happier. While I don’t use Google Chrome because it hasn’t been ported to FreeBSD and I’m back with Opera (did anyone notice that Opera is getting sexier with each version?), I am surely in love with Chrome’s look. It’s clean, neat and simple.

Anyway, presenting the current xfce theme that I’m using, one of the best I’ve come across, the Chrome-like theme. I found another similar theme at gnome-look.org, but I like this one better because of its soothing color and simplicity. Minimal, just the way I like it.

I matched the theme with the Chameleon Xcursors X11 (Chameleon sky blue small) mouse theme, and it’s brilliant. I love it!

Now I can stop thinking about beautifying my desktop for a while..

Last day of 2009.

Filed under: Life, Tech — Tags: — od @ 5:06 am

A bit of an old article, but an interesting read.

2009 has been kind to me, sort of.

December 16, 2009

Drive failure expected in less than 24 hours. SAVE ALL DATA.

Filed under: Tech — Tags: , , — od @ 3:00 pm

And so I’ve spent days and days, prolly weeks, getting rid of some of the problems I faced on my FreeBSD box. Currently running 8.0-RELEASE, I’ve upgraded almost all of my installed ports, and got my usb stick to work with the new usb stack. Everything works wonderfully, it’s stable and I’m a happy user.

Happy eh?

At least I thought so.

Today it crossed my mind to check how my hard disks are doing and, to my surprise, one of the reports came back with a failure alert:

# smartctl -d ata -A /dev/ad4
smartctl version 5.38 [amd64-portbld-freebsd8.0] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0003   165   164   021    Pre-fail  Always       -       4741
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       211
  5 Reallocated_Sector_Ct   0x0033   065   065   140    Pre-fail  Always   FAILING_NOW 1080
  7 Seek_Error_Rate         0x000e   200   197   051    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   085   085   000    Old_age   Always       -       11344
 10 Spin_Retry_Count        0x0012   100   100   051    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0012   100   100   051    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       209
192 Power-Off_Retract_Count 0x0032   193   193   000    Old_age   Always       -       5447
193 Load_Cycle_Count        0x0032   001   001   000    Old_age   Always       -       1227719
194 Temperature_Celsius     0x0022   097   092   000    Old_age   Always       -       50
196 Reallocated_Event_Count 0x0032   101   101   000    Old_age   Always       -       99
197 Current_Pending_Sector  0x0012   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   051    Old_age   Offline      -       0

And this made my heart race even faster:

# smartctl -H /dev/ad4
smartctl version 5.38 [amd64-portbld-freebsd8.0] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: FAILED!
Drive failure expected in less than 24 hours. SAVE ALL DATA.
Failed Attributes:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   065   065   140    Pre-fail  Always   FAILING_NOW 1080

Less than 24 hours? And I happened to be damn lucky to check on my disks today? Since the day I bought the hard disks, I’ve never really checked on them. Today I did, and I got this. Am seriously hoping that it’s just a false alarm (am still gonna do my backups).
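
For what it's worth, the way to read that table is that an attribute counts as failed once its normalized VALUE has dropped to or below THRESH, which is exactly what happened to Reallocated_Sector_Ct (065 against a threshold of 140). A rough sketch that picks such rows out of smartctl -A output follows; failing_attributes is a made-up helper name, and it assumes the ten-column layout shown above.

```shell
# Sketch: list SMART attributes whose normalized VALUE is at or below
# THRESH, reading "smartctl -A" output on stdin.
failing_attributes() {
    awk '
        # Attribute rows start with a numeric ID and have 10 columns.
        $1 ~ /^[0-9]+$/ && NF >= 10 {
            value = $4 + 0
            thresh = $6 + 0
            if (value <= thresh)
                printf "%s value=%d thresh=%d raw=%s\n", $2, value, thresh, $10
        }
    '
}
```

Used as smartctl -A /dev/ad4 | failing_attributes, it would print only the rows worth panicking about, which makes it easy to drop into a cron job instead of checking by hand once every two years.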

And that is my roughly 2-year-old WD hard disk. My second WD hard disk. My first WD hard disk failed on me after running for less than a year. Jinx? Lol. Wait. This is not funny! Hmmmpphhhhhhhh..

Anyway, am too frustrated now. My hands are cold, my feet are cold, and all I can think of right now is doing a backup so that I can port my old system to a new hard disk. Which I have to buy..

Speaking of which.. another WD? Seriously, I don’t know. Everyone says WD is the best, yet 2 of them died fast enough (well, this one hasn’t died yet but is already complaining). Maxtor served me for a good 5 years or more, but its reputation isn’t that great either compared to other brands. Suggestions?

December 2, 2009

StatsView on Ubuntu Jaunty.

Filed under: Tech — Tags: , , , — od @ 1:37 am

Quoted from StatsView README:

#####################################################

PREREQUISITES
————-

StatsView is written in Perl5, using the Perl Tk extension library. I recommend
that you use perl5.005_03 or later, and Tk800.014 or later, as StatsView has
been tested with these versions. You will also need the Tk::GBARR add-on
package for this version.

The graphing is done with gnuplot, and version 3.7 or later is required –
a copy can be found in the gnuplot_src subdirectory.

#####################################################

I already have Perl5 and gnuplot in my system, so I only need to install Perl Tk, Tk-GBARR and StatsView.

Install Perl Tk:

# apt-get install perl-tk


Grab Tk-GBARR from http://www.cpan.org/authors/id/SREZIC/. At the time of writing, the latest version of Tk-GBARR is 2.08:

# wget http://www.cpan.org/authors/id/SREZIC/Tk-GBARR-2.08.tar.gz
# tar xvfz Tk-GBARR-2.08.tar.gz
# cd Tk-GBARR-2.08
# perl Makefile.PL
# make
# make test
# make install


Grab StatsView from http://www.cpan.org/authors/id/ABURLISON/.

# wget http://www.cpan.org/authors/id/ABURLISON/StatsView-1.4.tar.gz
# tar xvfz StatsView-1.4.tar.gz
# cd StatsView-1.4
# perl Makefile.PL
# make install
# ./scripts/sv


Test StatsView using the example included:

# cd /path/StatsView-1.4/examples
# gzip -d sar.txt.gz
# ../scripts/sv sar.txt


It worked wonderfully using the sar output sample.. but I couldn’t get it to display any graph using my collection of sar output, both in binary and text formats. At this point am still not sure why it kept on complaining that my output file is invalid.

They are valid AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAa!!! 😯

Now what am I supposed to tell my boss?

“Hi boss.. remember the Linux server performance data that I promised last week? I don’t have it.”
“How come?”
“I have it, but I don’t have it in a pwetty graph format like I pwomised.”
“Well it’s okay you can pass me the raw data.”
“It’s in binary.”
“What?”
“Uh.. you know there are 10 types of people in the world: Those who understand binary, and those who don’t…”
“You don’t have anything to present, do you?”
*GULP*

Source: http://jerkharris.com/books/books/PerlOrDBA/oracleperl-CHP-3-SECT-3.html

June 22, 2009

Installing Sun VirtualBox on FreeBSD 7.1/amd64.

Filed under: Tech — Tags: , , — od @ 12:41 am

Today I decided to install Sun VirtualBox on my FreeBSD 7.1/amd64. The installation from ports is straightforward as usual.

# csup -g -L2 /etc/ports-supfile
# cd /usr/ports/emulators/virtualbox
# make install clean

The build got stuck when it failed to fetch Dev86src-0.16.17.tar.gz, so I grabbed the file by hand, dropped it into distfiles and ran the install again:

# wget ftp://ftp.freebsd.org/pub/FreeBSD/ports/distfiles/Dev86src-0.16.17.tar.gz
# mv Dev86src-0.16.17.tar.gz /usr/ports/distfiles/
# make install clean
# hash -r ; ldconfig

Mount proc:
# mount -t procfs proc /proc

Load vbox driver module:
# kldload vboxdrv.ko
# kldstat | grep box
13    1 0xffffffffab46a000 3fb0e    vboxdrv.ko

Launch vbox from menu or command:
# VirtualBox

Done.
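
One caveat: neither the procfs mount nor the kldload survives a reboot. The usual FreeBSD way to make them stick, as far as I know, is a vboxdrv_load="YES" line in /boot/loader.conf and a procfs entry in /etc/fstab. Below is a small sketch of a helper that appends a line to a config file only when it is missing, so it is safe to run twice; ensure_line is a made-up name, and the two intended calls are left commented out.

```shell
# Sketch: append a line to a config file only if it is not already there.
ensure_line() {
    file=$1
    line=$2
    # -x: match the whole line, -F: treat the text literally, -q: quiet
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Intended use as root (commented out so sourcing this changes nothing):
# ensure_line /boot/loader.conf 'vboxdrv_load="YES"'
# ensure_line /etc/fstab 'proc /proc procfs rw 0 0'
```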

June 20, 2009

Virtualbox is now available in ports!

Filed under: Tech — Tags: , — od @ 8:40 am

This could be one of the bestest news I’ve heard in months!!

Excerpt from miwi:

“Today Virtualbox was committed to the FreeBSD ports tree. After a lot of work we had a good discussion today about how stable Virtualbox is, and after the CTF with take6 we got a lot of good feedback, so it was time to commit.”

Read it here.

Woohoo!!

May 23, 2009

Edge Load Balancer Network Dispatcher – Double Collocated HA on HP-UX.

Filed under: Tech — Tags: , , , , — od @ 5:01 am

One of my recent projects was to configure the Edge load balancer on 2 servers in a high availability (HA) environment. I rarely do Edge, but the configuration is pretty straightforward. In my past projects, the Edge implementation has always been on separate boxes, which is easier than a collocated setup. In this post I’m going to share my configuration for the Edge dispatcher (MAC forwarding) that resides together with the web server (I’m using IHS) and WebSphere. Each server uses 1 IP address for both web server and dispatcher. The configuration is almost the same, but there were a few issues that I encountered, and I hope this post will be of help to those who are dealing with the Edge dispatcher as well.

For a typical setup where the Edge load balancers do not reside on the same boxes as the web servers, the general rules are:
– Primary Edge – cluster IP aliased to its NIC.
– Standby Edge – cluster IP aliased to its loopback.
– Web Servers – cluster IP aliased to loopback.

The same rules hold in a collocated environment:
– Primary Edge – cluster IP aliased to its NIC.
– Standby Edge – cluster IP aliased to its loopback.

Collocated Edge.

Double collocated HA edge.

Say I have the following:
Cluster IP – 192.168.10.10
Cluster port – 8080
Primary Edge – 192.168.10.20
Backup Edge – 192.168.10.21


default.cfg for Primary Edge:

dscontrol set loglevel 5
dscontrol set logsize 50000000
dscontrol executor start

dscontrol executor set nfa 192.168.10.20

dscontrol highavailability heartbeat add 192.168.10.20 192.168.10.21
dscontrol highavailability backup add primary auto 8880
dscontrol highavailability reach add 192.168.10.55
dscontrol highavailability reach add 192.168.10.56

dscontrol cluster add 192.168.10.10
dscontrol port add 192.168.10.10:8080

dscontrol server add 192.168.10.10:8080:192.168.10.20
dscontrol server add 192.168.10.10:8080:192.168.10.21

dscontrol manager start manager.log 10004
dscontrol man reach set loglevel 5
dscontrol man reach set logsize 50000000
dscontrol advisor start Http 192.168.10.10:8080 Http_192.168.10.10_8080.log



default.cfg for Standby Edge:

dscontrol set loglevel 5
dscontrol set logsize 50000000

dscontrol executor start

dscontrol executor set nfa 192.168.10.21

dscontrol highavailability heartbeat add 192.168.10.21 192.168.10.20
dscontrol highavailability backup add backup auto 8880
dscontrol highavailability reach add 192.168.10.55
dscontrol highavailability reach add 192.168.10.56

dscontrol cluster add 192.168.10.10
dscontrol port add 192.168.10.10:8080

dscontrol server add 192.168.10.10:8080:192.168.10.21
dscontrol server add 192.168.10.10:8080:192.168.10.20

dscontrol manager start manager.log 10004
dscontrol man reach set loglevel 5
dscontrol man reach set logsize 50000000
dscontrol advisor start Http 192.168.10.10:8080 Http_192.168.10.10_8080.log
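
The two files differ only in the role keyword and in which address comes first, so they could just as well be generated from one template. Here is a sketch of that idea; gen_default_cfg is a made-up helper, the dscontrol lines are copied from the two listings above, and the cluster address, port and reach targets are simply hard-coded to the example values from this post.

```shell
# Sketch: print a default.cfg for one node of the HA pair.
gen_default_cfg() {
    role=$1   # "primary" or "backup"
    self=$2   # this node's IP (also used as its NFA)
    peer=$3   # the other node's IP
    cluster=192.168.10.10
    port=8080

    cat <<EOF
dscontrol set loglevel 5
dscontrol set logsize 50000000
dscontrol executor start

dscontrol executor set nfa $self

dscontrol highavailability heartbeat add $self $peer
dscontrol highavailability backup add $role auto 8880
dscontrol highavailability reach add 192.168.10.55
dscontrol highavailability reach add 192.168.10.56

dscontrol cluster add $cluster
dscontrol port add $cluster:$port

dscontrol server add $cluster:$port:$self
dscontrol server add $cluster:$port:$peer

dscontrol manager start manager.log 10004
dscontrol man reach set loglevel 5
dscontrol man reach set logsize 50000000
dscontrol advisor start Http $cluster:$port Http_${cluster}_$port.log
EOF
}
```

For example, gen_default_cfg primary 192.168.10.20 192.168.10.21 prints the Primary file and gen_default_cfg backup 192.168.10.21 192.168.10.20 prints the Standby one, which at least rules out copy-and-paste differences between the two boxes.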



goActive script:

This script will remove the cluster IP from loopback and alias it to the NIC.

#!/bin/ksh

CLUSTER=192.168.10.10
LOOPBACK=lo0:1

ifconfig $LOOPBACK 0.0.0.0
dscontrol executor configure $CLUSTER



goStandby script:
This script will remove the cluster IP from NIC and alias it to the loopback.

#!/bin/ksh

LOOPBACK=lo0:1
CLUSTER=192.168.10.10
NETMASK=255.255.255.192

dscontrol executor unconfigure $CLUSTER
ifconfig $LOOPBACK $CLUSTER netmask $NETMASK up



goInOp script:
This script will remove the cluster IP from all devices (loopback and NIC).

#!/bin/ksh

CLUSTER=192.168.10.10
LOOPBACK=lo0:1
NETMASK=255.255.255.192

dscontrol executor unconfigure $CLUSTER
ifconfig $LOOPBACK $CLUSTER netmask $NETMASK down



The normal method to test whether high availability works smoothly is to unplug the network cable from the edge server. I tail the root mail (/var/mail/root) at the same time, so I can see which HA script is triggered when the network is interrupted. Another method is to bring down the server, by rebooting it or shutting it down. With a reboot you’ll only have a short time span to monitor the failover in action, but of course this depends on how long your servers take to start up.

But since this is a collocated environment, either of those methods takes the web server on that box out along with the dispatcher, so I wouldn’t be able to see whether the dispatcher balances all requests across both web servers (in my case I’m using the round robin algorithm). So what I did was manually stop the executor so that failover occurs. Note that stopping dsserver alone won’t trigger the HA scripts; in fact it is not necessary to stop dsserver at all. Well, to be honest, even when it’s not a collocated environment I normally test HA failover by stopping the executor, since I’m usually working remotely and unplugging the cable requires help from the sys admins. So I might as well test that it’s really working before going through all the hassle.

One of the problems that I encountered was instability. Sometimes the dispatchers would run in the right modes (one active, one standby), but most of the time both would run as active. It was very unstable, with no pattern that I could track. Even worse, sometimes when I tried to run the dispatcher as a standalone load balancer, all of the incoming requests were routed directly to the web server, skipping the dispatcher completely. I was stuck with this problem for several days before I finally figured out what the culprit was.

The ibmlb module.

Every time the executor is stopped, the ibmlb module is unloaded. Every time the executor starts, the ibmlb module is loaded into the kernel. I’m lucky that I have dmesg on both servers, so based on dmesg, this is how it should look whenever you stop and start the executor:

ibmlb DLKM successfully unloaded
ibmlb DLKM successfully loaded

But what happened was, when I stopped the executor, ibmlb was not unloaded. Its status was busy, and I had to unload the module explicitly.

ibmlb DLKM successfully unloaded
ibmlb DLKM successfully loaded
ibmlb version is 06.01.00.00 – 20060515-232359 [wsbld265]
WARNING: moduload : module is busy, module id = 14, name = ibmlb
WARNING: moduload : module is busy, module id = 14, name = ibmlb
WARNING: moduload : module is busy, module id = 14, name = ibmlb
WARNING: moduload : module is busy, module id = 14, name = ibmlb
WARNING: moduload : module is busy, module id = 14, name = ibmlb

I’ve not seen anything like this before (I used to configure dispatcher on AIX servers). Consider the following test cases (arp table checked from a different server that resides on the same segment):

TEST 1.

1) Primary active, Backup standby. Cluster IP belongs to Primary.
2) Primary down, Backup goes active. Module ibmlb is UNLOADED successfully on Primary. Cluster IP belongs to Backup.
3) Primary up in active mode, Backup goes standby. Cluster IP belongs to Primary.

TEST 2.
1) Primary active, Backup standby. Cluster IP belongs to Primary.
2) Primary down, Backup goes active. Module ibmlb is busy and still LOADED on Primary. Cluster IP belongs to Backup.
3) Primary up in active mode, Backup stays active. Cluster IP belongs to Primary, but all requests will skip dispatcher and go straight to the web server.


TEST 3.

1) Primary active, Backup standby. Cluster IP belongs to Primary.
2) Primary down, Backup goes active. Module ibmlb is UNLOADED successfully on Primary. Cluster IP belongs to Backup.
3) Primary up in active mode, Backup goes standby. Cluster IP belongs to Primary.
4) Backup down. Module ibmlb is UNLOADED successfully.
5) Backup up, running in standby mode.
6) Backup down. Module ibmlb is busy and still LOADED on backup.
7) Backup up, running in active mode (remember that Primary is also in active mode too). Cluster IP belongs to Backup, but all requests will skip the dispatcher and go straight to the web server.
8) Backup down. Module ibmlb is busy and still LOADED on backup. Explicitly unload the module using kcmodule command until it gets UNLOADED. Cluster IP belongs to Primary.
9) Backup up, running in standby mode.

Most of the time I wasn’t able to unload it right away; I had to let the server ‘rest’ for about 15 to 20 minutes before trying to unload it again. Rebooting the server always solves this problem (the module’s next state is unused). Am not sure if there’s a way to force a module to be unloaded though. As far as I know there’s no force flag for kcmodule.
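
Since waiting and retrying was the only thing that worked, the retrying can at least be scripted. This is a generic sketch, not an HP-UX tool: retry is a made-up helper that reruns any command until it succeeds or the attempts run out, and the kcmodule line in the comment assumes ibmlb=unused is the unload command you would otherwise type by hand.

```shell
# Sketch: run a command until it succeeds, pausing between attempts.
retry() {
    attempts=$1   # maximum number of tries
    delay=$2      # seconds to wait between tries
    shift 2
    n=1
    while :; do
        "$@" && return 0
        [ "$n" -ge "$attempts" ] && return 1
        n=$((n + 1))
        sleep "$delay"
    done
}

# Intended use on the Edge box (commented out so sourcing this runs nothing):
# retry 6 300 kcmodule ibmlb=unused
```

Six tries five minutes apart covers the 15 to 20 minute ‘rest’ with some margin; both numbers are guesses to tune.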

I was fooled several times since I tested the splash page of the web servers from my Opera browser. I was on a different subnet, so I guess there must be a switch/router in between me and the edge servers. At times, even when the cluster IP was aliased to the Primary Edge, my browser would hit the Backup Edge since the ARP cache had not been refreshed. It was so annoying since this affects the cluster report. The rest of the testing was done by running a browser from a different server that belongs to the same subnet. At least I could clear the ARP cache manually if I had to.
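
Checking who currently owns the cluster IP from that same-subnet server boils down to reading the ARP table. A small sketch that pulls the MAC address for one IP out of arp -a output is below; mac_for_ip is a made-up helper name, and it assumes entries of the form "... (IP) at MAC ...", which is what I'd expect from most Unix arp implementations but have not verified everywhere.

```shell
# Sketch: print the MAC address that "arp -a" output (on stdin) lists
# for the given IP. Prints nothing if the IP is not in the table.
mac_for_ip() {
    awk -v ip="($1)" '
        {
            for (i = 1; i <= NF - 2; i++)
                if ($i == ip && $(i + 1) == "at") { print $(i + 2); exit }
        }
    '
}
```

Running arp -a | mac_for_ip 192.168.10.10 and comparing the result with the NIC address of the Primary tells you whether the entry is stale; if it is, arp -d 192.168.10.10 should clear it so the next request re-resolves.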

Okay probably this is my browser problem, but testing the splash page with Firefox sucks. It kept on hitting the splash page even after I’ve stopped both web servers, and cleared up the cache. It was alright with Opera though. What gives?

By the way I’m using Edge v6.1. If you check out the Edge Fixpack page here, you’ll notice that there is no patch for HP-UX. Not a single patch. Is IBM trying to say something? Don’t use Edge on HP-UX, perhaps? Lol. Anyway, IBM packed me a patch (6.1.0.35), but still it didn’t address the module issue. Am not sure if I could call it a patch though, it’s more like an installer since I had to reinstall everything.

Thanks to Robert Brown from IBM for assisting me on this ‘false alarm’ panic attack (initially I thought it was a network issue).
