Welcome to the Enterprise Computing section of the website. This section contains information about Enterprise Architecture and software and services that are available from HMNL.
Ian Tree 03 June 2015 10:02:22 - Amsterdam
High CPU Usage by NIS.EXE after closing Google Chrome
This issue has been posted in various forums around the internet, but no definitive diagnosis or fix appears to have been found.
Starting about a month ago, every time Google Chrome was closed a noticeable increase in CPU consumption was observed; this higher-than-normal consumption would continue until either Google Chrome was restarted or the system was rebooted. Task Manager showed that a single thread in NIS.EXE was consuming high levels of CPU. The problem was observed on a laptop running Windows 7 Professional, Google Chrome and Norton Internet Security, all software at the latest maintenance levels.
ProcMon was started while Google Chrome was running; Chrome was then closed, and once the high CPU usage was observed the ProcMon collection was stopped. Examining the activity for the NIS.EXE process revealed the following behaviour. While Chrome was running, NIS would attempt to check the file c:\Users\UserName\AppData\Local\Google\Chrome\User Data\Default\Secure Preferences (where UserName is the current Windows user); NIS would find an exclusive lock held on the file and would gracefully give up and try later. Once Google Chrome was shut down, the lock on the offending file was released and NIS would again attempt to check the file. This time it would get an "Access Denied" response from the CreateFile() attempt, and NIS would then remain in a tight loop, retrying and failing on the same sequence of CreateFile() calls. NOTE: I could see no obvious reason why the CreateFile() call should have failed.
Excluding the offending file from both Scan and SONAR in the NIS settings appears to have resolved the problem.
Google Chrome modifies the file during startup, and this triggers a file update notification that is picked up by NIS. NIS attempts to examine the file, hits the exclusive lock held by Chrome and so gives up, making a note to try later. On shutdown of Chrome the file is again modified, and a new notification is raised and picked up by NIS. Some condition in the security attributes of the file causes the NIS access attempts to fail; this particular failure is not expected by NIS, so it continuously retries the operation on the file, and this tight retry loop consumes the high level of CPU.
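For contrast, a well-behaved scanner would bound its retries and back off between attempts rather than spin. A minimal Python sketch of the idea (the `check` callable and error handling here are hypothetical illustrations, not how NIS is actually implemented):

```python
import time

def check_with_backoff(check, max_attempts=5, base_delay=0.1):
    """Retry a failing check with exponential backoff instead of a tight loop.

    `check` is any callable that raises OSError on failure (standing in for a
    scanner's open-and-examine step on a locked or inaccessible file).
    """
    for attempt in range(max_attempts):
        try:
            return check()
        except OSError:
            # Back off: 0.1s, 0.2s, 0.4s, ... instead of retrying immediately
            time.sleep(base_delay * (2 ** attempt))
    return None  # give up after max_attempts rather than spinning forever
```

Even a modest delay like this would have kept the retry loop from saturating a CPU core.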
After about 24 hours the behaviour returned; examination with ProcMon showed the same pattern of activity on the executable file c:\Users\UserName\AppData\Local\Google\Chrome\Application\Chrome.exe, so this file was also excluded from Scan and SONAR in the NIS settings.
Ian Tree 10 March 2015 21:22:20 - Eindhoven
DX Tools - Kernel 3.14 is Released
A new version of the DX Tools kernel (DXCommon) Version 3.14.1 is now available for download.
The new release supports Domino on 64-bit Windows platforms. There are also many low-level functional additions, such as support for selectable ODS levels for the target database in the database copier.
New versions of the individual DX Tools will be available as they are updated to take advantage of the new features in the kernel. QCopy Version 2.6.2 is already available.
Go to the DX Tools section of the website for further information and download links.
Ian Tree 22 January 2015 17:31:31 - Eindhoven
A Reasonable Expectation of Privacy – My Internet Passwords
A password that I use to access an internet service is my private property; there is no valid reason why anyone, including the provider of the internet service that it grants access to, needs to know what my password is. When I attempt to authenticate with a particular internet service, the service provider only needs to confirm that the password and identity combination that I have provided are the same values that were used when I registered with the service or last changed the password.
Passwords should only ever be stored in the service provider's credential store once they have been one-way encrypted or uniquely hashed. One-way encryption schemes are simple: generate an asymmetric encryption key pair, throw away one of the keys from the pair, and use the remaining key to encrypt passwords, with reasonable certainty that they can never be decrypted.
For authentication, the service encrypts the offered password using the same one-way key and then compares it with the encrypted password stored in the credential store; if it matches, access is granted to the service, otherwise access is refused.
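As a minimal sketch of this register-then-compare flow, here is the same idea in Python using salted PBKDF2 (a standard one-way key-derivation hash, standing in for the one-way encryption scheme described above); the function names are illustrative only:

```python
import hashlib
import hmac
import os

def register(password: str) -> tuple[bytes, bytes]:
    """Store only a salt and the one-way hash, never the password itself."""
    salt = os.urandom(16)  # per-user salt increases the entropy of the message
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return salt, digest

def verify(password: str, salt: bytes, stored: bytes) -> bool:
    """Hash the offered password the same way and compare the results."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return hmac.compare_digest(candidate, stored)  # constant-time comparison
```

Nothing recoverable ever reaches the credential store: the service compares hashes, not passwords.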
There are a few implementation details to look out for; as ever, the devil is in the detail. Make sure that all messages containing passwords are encrypted in transport using TLS/HTTPS or similar transport session protection. Make sure that messages containing passwords are encrypted at the highest layer in the software stack after the transport decryption has been applied, and make sure that there is no possibility of logging any message containing a decrypted password. Passwords make very short messages to encrypt, so ensure that the encryption or hashing method is good at handling short messages; to this end it is a good idea to salt the password with other information from the user credentials to increase the entropy of the message.
Passwords are still vulnerable to discovery if they have low password strength, as they can be guessed or determined through dictionary attacks; users should guard against the former and service providers against the latter. Users must, of course, remain vigilant for social engineering attacks that attempt to get them to supply passwords of their own accord. Service providers also need to guard against social engineering attacks on their lost/forgotten password reset processes.
I first came across this methodology in 1978 and first used it in anger in the early eighties, so it is not novel or unusual; it is discomforting to see how many services do not implement this or an equivalent level of protection for users' passwords.
Ian Tree 28 October 2013 10:58:00 - Eindhoven
Just When Did Data Get Big?
It never did, of course. Data and the demands on processing it have always been bigger than the box of processing capacity that we had to put it in.
I was having a discussion the other evening with some fellow enterprise architects over some of the design challenges they faced in handling the Big Data needs of their respective organisations. Somewhat fuelled by the fact that one of them was picking up the drinks tab I was forced to confront them with the fact that the problems they were grappling with were not the new high frontier of enterprise IT. We have been here before.
Back in the Day
"Back in the Day" in this case is back in the 70's. In those days financial and marketing organisations were crying out for more data to drive their strategic and operational systems and technological development was increasing our capacity to process data. In the marketing arena we had to deal with ever increasing volume and variability of data and the demand to process it at an ever increasing velocity. The challenge was that there were only 24 processing hours in a day and every day there was a new operational schedule of processing that needed to be completed and projections of elapsed processing times were always heading for 36 hours for the daily schedule in the next 6 months.
There was "NoSQL", sorry, I mean "No SQL" at that time. Ted Codd was still busy working on the theory and IBM was toying with experimental designs for "System R". Even if SQL had been around disk storage was so slow, low capacity per unit and hugely expensive that a disk based database of the size needed was a dream too far. The marketing master database was a multi-reel tape sequential file, some of these databases had quite sophisticated data structures consisting of a sequentially serialised form of a hierarchic data structure. Tapes and decks of cards would arrive from data preparation bureaus containing event records and update transactions that needed to be applied to the master database. The incoming transactions would be validated and then sorted into the sequence of the master database. Once the daily input was available then the "Daily Update Run" would commence. The daily update run was a copy with updates process that would read the latest generation of the master database and write a new generation with the updates applied and new records inserted. Once the daily update job was completed then a number of analytics and reporting jobs would have to be run, each requiring a sequential pass through the complete master database. After the analytics jobs had been run then the extraction jobs would be run to extract complex selections of a subset of the data these extracts would be used for generating direct marketing offers and other purposes, again each of these jobs would require a complete sequential pass through the master database.
Rising to the Challenge
We started out by partitioning the master database, first into two and later into four separate datasets; this allowed four daily update jobs to be run in parallel. The analytic and reporting jobs were split, and their respective outputs would be combined in a new job to produce the final outputs, a kind of primitive Map-Reduce solution. The data extraction jobs were a little more challenging to partition, as they were built around some sophisticated fulfilment algorithms that dynamically adjusted record selection probabilities based on the recent achievement history; these required additional work so that they would perform correctly on the partitioned master database. The changes made to the extraction algorithms also meant that we were able to run all of the required extractions in a single set of jobs, further reducing the elapsed run time for the complete schedule.
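The split-then-combine pattern can be sketched in a few lines of modern Python; the `analyse` and `combine` callables here stand in for whatever the analytic jobs and the final combining job actually did:

```python
from concurrent.futures import ThreadPoolExecutor

def partitioned_run(partitions, analyse, combine):
    """Run the analytic job on each partition in parallel, then combine the
    partial outputs in a final job - a primitive Map-Reduce."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        partials = list(pool.map(analyse, partitions))
    return combine(partials)
```

For example, counting records across four partitions would be `partitioned_run(parts, len, sum)`.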
Don't get me wrong, I think that we are doing some fantastic things with Big Data these days and are finding new and creative solutions to many of the associated problems. However, I also think that we too readily dismiss some of the approaches and solutions that were applied in the past to equivalent classes of problems.
"Grey hair get's that colour because it leaks from the grey cells beneath."
Ian Tree 08 October 2013 15:03:45 -
Calibrating some String Search Algorithms
Last weekend I had the need to calibrate a new test environment in preparation for doing some performance benchmarking. I cast around for an old benchmark application that would not take much set up time and I came across an application that performed comparative benchmarks on different string search algorithms. I compiled the application with both Visual Studio 2005 and Visual Studio 2012 and ran through the test series.
In the benchmark application each algorithm is exercised by performing 10,000 cycles of a series of 5 searches (4 hits and one miss), the application reports the elapsed time in milliseconds for each test. The results are shown below.
| Algorithm | VS 2005 Debug | VS 2005 /Od (disable) | VS 2005 /O2 (speed) | VS 2012 Debug | VS 2012 /Od (disable) | VS 2012 /O2 (speed) |
| --- | --- | --- | --- | --- | --- | --- |
| Reference builtin (strstr) | 7,332 | 7,254 | 8,128 | 843 | 718 | 873 |
The results of the Knuth-Morris-Pratt tests should probably be discounted; I don't remember it being that slow, and I suspect that the test application has some experimental modifications in that algorithm.
What is interesting is the nearly ten-fold improvement in performance of the builtin strstr() function between the Visual Studio 2005 and Visual Studio 2012 implementations. This improvement makes the builtin function the outright winner of this test set in the VS 2012 implementation.
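The test loop described above (10,000 cycles of a fixed series of searches) can be sketched as follows; the haystack, patterns and search function here are hypothetical stand-ins for the originals, which were in C++:

```python
import time

def benchmark(search, haystack, patterns, cycles=10_000):
    """Time `cycles` repetitions of a fixed series of searches and report the
    elapsed time in milliseconds, as the original test application did."""
    start = time.perf_counter()
    for _ in range(cycles):
        for pattern in patterns:  # four expected hits and one expected miss
            search(haystack, pattern)
    return (time.perf_counter() - start) * 1000.0
```

The same harness can then wrap the builtin (`str.find` here, `strstr()` in the original) and each hand-coded algorithm in turn, so that all candidates see identical work.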
Ian Tree 22 September 2013 13:10:02 -
Talk Almost on Autonomic Computing
Summary of a talk given to a group of developers on the topic of "Autonomic Optimisation"; the talk went off the rails somewhat - but then again, maybe not.
Read the summary here
Ian Tree 11 September 2013 12:20:09 - Eindhoven
Games and Personal Software
You may have noticed on your way here a new section of the web site that is dedicated to "Games and Personal Software". That area of the site is used to publish all sorts of personal software, mainly derived from stuff that we have lying about the lab. We have accumulated an awful lot of developed software over the years: in-house utilities, test platforms for particular programming constructs, and everything in between. Some of the software is open source and free to download, while other software is available to purchase.
Go to the Games and Personal Software section to take a look at what is on offer.
Ian Tree 05 September 2013 07:40:36 - Eindhoven
New Domino Tooling
New in Domino 9
Installation and use of the "Advanced Designer" requires the following to be installed and working.
Brain R5.2.1 FP2 (although tests have shown good backwards compatibility to R3.0.1).
Paper V1.7 or later installed on Desk (all versions) or ClipBoard for mobile users.
These components are fully Web 2.0 compatible.
Warning Do not develop Domino applications without the supervision of a grown up.
Ian Tree 10 May 2012 12:23:25 - Eindhoven
The Domino eXplorer
The Domino eXplorer (DX) was developed to facilitate the rapid development of C++ tools for projects that involve high volumes of data transformation. DX has been, and continues to be, developed for use across a wide range of Domino versions and platforms. The tool set is also appropriate for Business Intelligence applications that have to process "Big Data" in Notes databases. The reference platforms are Domino 9.0.x on Windows Server 2003 R2 (32 bit and 64 bit) and Red Hat Linux 6.6. DX is also used as a research tool to investigate various aspects of Autonomic Systems, in particular Autonomic Throughput Optimisation.
Standardised utilities have also been built around some of the functional DX classes; these are published as “DX Tools” and can save time by providing off-the-shelf processing that can be incorporated into complex transformations that need high throughput rates.
DX is NOT a framework (we hate frameworks); instead it provides a grab bag of classes that can be assembled in different designs to provide high-throughput processing of Notes databases. There are certainly some constraints imposed by the relationships between different objects, and contracts imposed by the API, but these have been kept as minimal as possible.
All software and associated documentation distributed under the banners of "Domino eXplorer" and "DXCommon" and the "DX Tools" is distributed as Free software. It is distributed free of charge, free of obligation and free of restriction of use. All software is distributed as a zipped package for Windows and a gzipped tarball for Linux, documentation is supplied in Word for Windows 97-2003 and Open Document Text formats for editing and PDF and XPS formats for viewing. All Notes Databases are distributed as "Virtual Templates" available on the internet.
The published versions of the software are all the latest fully featured production versions as currently used by our teams in various high volume transformation projects.
- Extensive functional coverage
- Modular design and implementation
- Full documentation
- Field proven
- Comprehensive and comprehensible