
AWS Load Balancers and HTTPS

I was helping a client with his web server that is hosted in AWS (Amazon Web Services) EC2. He had gotten a certificate to enable HTTPS but it wasn’t working.

AWS offers free certificates, but you can’t install them into the EC2 web server. In this case, he had set up a load balancer in front of the web server and the Certificate Manager certificate was installed there. This means that when the end user browses to the website, the browser is really talking to the load balancer, and the load balancer is talking to the web server and passing information back and forth.

I made some assumptions about how he had set up the load balancer forwarding, so it took me a while to get my arms around what was going on. I was configuring the Apache web server to do redirects in the .htaccess file. He wanted to force browsers to use HTTPS and wanted to make “www” his “authoritative URL”, meaning that if someone typed “domain.com” into their browser, it would redirect them to “www.domain.com”. (This is a good idea for SEO. Google doesn’t assume/realize that domain.com and www.domain.com are the same website.)

http://domain.com was redirecting perfectly to https://www.domain.com, but http://www.domain.com was not redirecting to https://www.domain.com. I finally realized that the load balancer’s forwarding was configured to use HTTPS: incoming HTTP and HTTPS traffic was both being forwarded to the web server over HTTPS, but the load balancer was communicating back to the browser on whatever protocol the request came in on. I set up the load balancer to communicate with the web server using HTTP, and then the redirects flowed properly back to the browser.

It’s easier to configure the load balancer to communicate with the web server over plain HTTP and let the load balancer handle the encryption on its public side, between itself and the browser.
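For reference, here’s roughly what the .htaccess rules end up looking like once the load balancer terminates HTTPS and talks to Apache over plain HTTP. This is a sketch rather than his exact configuration – domain.com is the placeholder from above, and it assumes mod_rewrite is enabled and that the load balancer sets the standard X-Forwarded-Proto header (AWS load balancers do):

  RewriteEngine On
  # Redirect if the original request wasn't HTTPS (the load balancer tells
  # Apache via X-Forwarded-Proto) or if the host name is missing "www".
  RewriteCond %{HTTP:X-Forwarded-Proto} !https [OR]
  RewriteCond %{HTTP_HOST} !^www\. [NC]
  RewriteRule ^(.*)$ https://www.domain.com/$1 [R=301,L]

Because Apache only ever sees HTTP from the load balancer, the X-Forwarded-Proto check is what tells it whether the browser’s original request was encrypted.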

Posted in IT

Hosted PBXs require fine-tuned firewalls

I’ve been working with a client who has moved to a cloud-based VOIP PBX server. In general, I’m a fan of this (and just about anything “cloud”) – but there are a lot of firewall configurations that need to be just right to make this work well.

This is particularly true if you only have a single server at the hosting data center. Multiple phones at your physical location talking to a server on the other side of your Internet connection is tricky.

A better configuration involves a secure VPN connection between your physical location and your hosting provider. If their offering is a single server, they may not be set up to do this. Take a look at setting up a small network inside Amazon Web Services (AWS), Microsoft Azure, or Google Cloud. You should be able to set up a VPN between your physical location and any of these. Once that’s done, from your PBX’s and phones’ point of view, they are communicating on the same network, which is much more straightforward.

Explaining NAT (Network Address Translation) is beyond the scope of this article, but that’s the complicating difference that the VPN eliminates.

Posted in IT

.Net/Web development challenges with Time Zones (Part 2)

This situation is completely obvious to me in hindsight, but I have to confess – I never thought about this until a couple of weeks ago.

As I talked about in Part 1, cloud computing increases the likelihood that servers are in different time zones than users. It seems like the simple solution to this is to store date/time values in UTC (what we used to call Greenwich Mean Time, or GMT).

If the server’s time zone is set to UTC, this works pretty well. If you send date/time data to the browser in JSON, Javascript running in the user’s browser will adjust it to the local time zone for you. (If you’re sending the date/time value in HTML text, you have to do this yourself.)

The other day I was debugging an application from my development computer, where the time zone is not set to UTC. (I live in the Central time zone.) Time values that I had saved to the database in UTC were not being displayed correctly in the browser.

It finally dawned on me that even though I used C#’s DateTime.UtcNow to generate the time originally, the DateTime value stored in the database had no concept of which time zone it was in. When my computer read the value from the database, it assumed that the value was in my computer’s time zone (Central) as opposed to UTC.

In C#, the solution to this problem involves the DateTime.Kind property. I’ll confess – I had never heard of it or used it. By using the static DateTime.SpecifyKind method, you can tell the system whether a DateTime is in the local time zone or in UTC. Once I started using this, the time values on the browser began displaying correctly.
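Here’s a minimal sketch of what that looks like. The data reader and column name are made up for illustration; the important part is the DateTime.SpecifyKind call on the value coming out of the database:

  // The value read from the database comes back with Kind = Unspecified,
  // so the runtime assumes it's local time unless you say otherwise.
  // ("reader" and the column name are hypothetical.)
  DateTime stored = (DateTime)reader["CreatedUtc"];

  // Mark the value as UTC. SpecifyKind returns a new DateTime; it doesn't
  // shift the time, it only changes how the value gets interpreted later.
  DateTime storedUtc = DateTime.SpecifyKind(stored, DateTimeKind.Utc);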

I’ve been programming a long time and this issue had never occurred to me. Hopefully it has occurred to you, but I’m betting there are at least a few people who read this who are in the same boat as me. Whether or not you’re using C#, this concept (what time zone a DateTime value is stated in) needs to be addressed in any application where servers and browsers are in separate time zones.

.Net/Web development challenges with Time Zones (Part 1)

One of the learning-curve issues when moving from a client-server environment, where all the application users were in the same building, to developing for the web is dealing with different time zones.

Dates have times whether you want them or not

The first time I remember being bitten by the time zone issue was several years ago, and it’s probably not what you’d expect.

My application had an Order table in a SQL Server database with an OrderDate field (DateTime data type). I wasn’t interested in the time part, so I was setting the value by using DateTime.Today, which gives today’s date with the time part set to 00:00:00 (midnight).

My test user called me to say that his OrderDate values were showing up off by 1 day. When retrieving the order, the web server was returning the order information in JSON. I didn’t realize at the time that browsers will automatically “time-zone-shift” JSON dates to the browser’s computer’s time zone. So, if the order date was “1/1/2019 00:00:00” and the test user’s time zone was 6 hours behind UTC – the browser (not realizing that the time part was not significant) translated that value to “12/31/2018 18:00:00”. My UI was formatting the date to only show the date part, so the OrderDate field value was showing as “12/31/2018” when it should have been “1/1/2019”.

This turned out to be a much trickier problem than I initially thought. This was when I realized the need for a Date data type (since there wouldn’t be any concept of time zone shifting with a Date-only type). I know SQL Server has a Date type, but C# still doesn’t.

At the time, I was in a hurry so I cheated a little bit. I started stamping the time part as noon (12:00:00) as opposed to letting it default to midnight (00:00:00). I didn’t care about the time part and it’s never displayed. This gives me 12 hours leeway in either direction with the date part being changed. Apparently some islands in the Pacific do have +13 and +14 time zones, but I was pretty sure that particular application wasn’t going to be used there.
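In code, the cheat is just a one-liner. A sketch, assuming (as in my case) that only the date part ever matters:

  // Stamp the "date only" value at noon instead of midnight, so a time zone
  // shift of up to 12 hours in either direction can't change the date part.
  DateTime orderDate = DateTime.Today.AddHours(12);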

That application has since been retired but I never found a more elegant solution to this problem. If anyone knows of one, please let me know.

Free SSL Certificates

About a year ago, Google started flagging unencrypted (available using HTTP as opposed to HTTPS) websites as “Not Secure” in the Chrome address bar. They have also started taking into account whether or not a site has HTTPS for purposes of search rankings. In other words, lack of HTTPS will affect your SEO.

Side note: HTTPS encryption is frequently referred to as SSL and the certificates that allow this are almost always referred to as “SSL certificates”, but this term is not technically accurate any more. SSL was the original cryptographic protocol used for HTTPS but it is obsolete and not considered secure any longer. TLS is what’s used for HTTPS encryption now, but the term “SSL” stuck.

For years, the general consensus was that you needed HTTPS for sites where you entered a credit card or things like that, but that for general information sites (like blogs) there was no need to encrypt the information. That consensus has changed over the last few years.

Most people have historically bought “SSL certificates” from a vendor like GoDaddy with prices starting around $75/year. A few years ago, a service called Let’s Encrypt was introduced by the Internet Security Research Group. Basically – they offer free certificates to encourage people to use HTTPS.

It sounds too good to be true and I was skeptical when I first heard about it, but it’s legitimate. I’ve been using their certificates for about a year, for websites running in AWS and Azure. There’s a bit of a learning curve in getting them to issue the certificates for you, but once you figure it out, you won’t ever need to pay for certificates again. (Blatant commercial message – we can help you with this learning curve.)
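For what it’s worth, the most common path on a Linux server running Apache is the certbot client. This isn’t a walkthrough of any particular setup – just a sketch of the general shape of it (package names vary by distribution, and domain.com is a placeholder):

  # Install certbot and its Apache plugin (Ubuntu/Debian package names)
  sudo apt-get install certbot python3-certbot-apache

  # Request a certificate and let certbot update the Apache configuration
  sudo certbot --apache -d domain.com -d www.domain.com

  # Let's Encrypt certificates expire after 90 days, so test automatic renewal
  sudo certbot renew --dry-run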

Posted in IT

Purge Your Data

Georgia Tech has suffered a data breach. I hadn’t heard about this until today, when my college age son received a letter from the school letting him know that his personal information (including date of birth and Social Security number) “may have been accessed”.

The Atlanta Journal-Constitution reported that “1.3 million current and former students, faculty and staff members” may have been affected. The article also notes the irony that this happened (twice) to the “world renowned university with lauded computer science programs”.

My son has never been a student at Georgia Tech. He did apply a couple of years ago but never enrolled. I can’t think of any reason why his personal information still needs to be in their systems.

It gets worse.

Two letters from Georgia Tech were in my mailbox today – the one addressed to my son, and one addressed to a former resident who I happen to know. (She taught one of my other sons in high school a couple of years ago.) Her letter was sent to where she lived in high school and used her maiden name, so I’m assuming she also applied to Georgia Tech. (I also know that she didn’t attend Georgia Tech.)

We’ve lived in this house for 16 years, so she applied at least that long ago, but I’m betting it was closer to 20 years ago. I really can’t imagine that Georgia Tech needs personal information about applicants from 20 years ago who never enrolled.

Too many systems are designed with people thinking of how to get data into the system without any thought of purging it when it’s not needed anymore.

Georgia Tech’s enrollment is ~27,000. If you do some basic math, it’s hard to come up with a good reason for there to be 1.3 million people’s personal information in their systems.

Breaches happen, but I’d much rather do crisis management for a breach affecting 100,000 than 1 million.

Purge your data.

Posted in IT

Stuck print jobs make Windows Server UI drag

There’s a Windows Server (2012 R2 to be specific) at a client’s location that I occasionally have to log in to via RDP. For the last couple of months, the UI performance had been painfully slow. Because I usually just needed to get in, check something quickly, and get out – I hadn’t spent any time tracking down what was going on. (It was still performing its “server duties” adequately so this wasn’t a huge priority.)

The other day, I decided I would spend a few minutes tracking it down. It turned out that there were 51 jobs in a print queue for a printer that was no longer physically on the network, but the printer share was still available on the server.

As soon as I deleted these print jobs (and the printer share), the UI performance was dramatically better. Obviously the shared printer shouldn’t have been left there and I know that the print jobs were taking up RAM, but I was surprised at how drastic the effect on performance was.

Posted in IT

Consider CAPTCHA

A few days ago, one of my clients called to say that their credit card processor had suspended their account because their website was being used to submit fraudulent charges and that a CAPTCHA mechanism needed to be added before they would reactivate the account.

(CAPTCHA example image: by Scooooly – own work, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=47265558)

We’ve all been required to complete a CAPTCHA process when doing something online. I knew the general idea was to make sure that the website was being used by a human and not a “bot”, but, I’ll confess – I really hadn’t given them much thought prior to this.

Google’s reCAPTCHA documentation begins by saying “reCAPTCHA protects you against spam and other types of automated abuse”. This was an eye-opening example of automated abuse. I’m still not exactly sure what was being attempted, but my best guess is that hackers were using the site to test combinations of digits to come up with valid credit card numbers, or something along those lines. Because this form was set up for donations and no merchandise was going to be shipped as a result of valid credit card charges, I hadn’t thought about the possibilities for this to be abused.

Example of Google reCAPTCHA v2

Luckily, using Google’s reCAPTCHA tools – it only took an hour or so to add the functionality to the site. I used v2 (pictured above) because that’s the one I was most familiar with, but I was very interested to learn that Google has now released reCAPTCHA v3 that doesn’t require any user interaction.
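For anyone curious what “adding the functionality” involves on the server side, here’s a rough C# sketch of the verification step. The variable names are mine, the secret key comes from the reCAPTCHA admin console, and the v2 widget posts its token in a form field named g-recaptcha-response:

  // using System.Net.Http; using System.Collections.Generic;
  // (Inside an async controller action.) Ask Google whether the token is valid.
  var http = new HttpClient();
  var fields = new Dictionary<string, string>
  {
      ["secret"] = recaptchaSecretKey,                     // from the reCAPTCHA admin console
      ["response"] = Request.Form["g-recaptcha-response"]  // token added by the widget
  };
  var reply = await http.PostAsync(
      "https://www.google.com/recaptcha/api/siteverify",
      new FormUrlEncodedContent(fields));
  var json = await reply.Content.ReadAsStringAsync();
  // json looks like {"success":true, ...} - reject the submission if success is false.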

Now that I’ve seen an example of non-obvious (at least to me) abuse and how easy it was to add CAPTCHA functionality, I’ll be reviewing other sites to see if there are other places where it would make sense to use this. I’m encouraging you to do this too.

(BTW – I never knew that CAPTCHA was an acronym for “completely automated public Turing test to tell computers and humans apart”.)

SendGrid Inbound Parse

I’ve spent most of a weekend trying to figure out how to get attachments from SendGrid’s Inbound Parse service. If you haven’t heard of it – it accepts an incoming email, parses out the various parts, and sends them to you by posting them as form values to a web page you have set up to receive them.

The “text” parts of the message (from, to, subject, etc.) were pretty straightforward, but working out how to get the files that had been attached to the message took me a long time. (Much longer than it should have, in hindsight.)

I followed the recommendations of the tutorial video and configured Inbound Parse to send the information to RequestBin, a service that will accept and show you posted form values without your having to write any code.

The values that SendGrid sent to RequestBin agreed with the documentation. There was an “attachments” value that told how many attachments were included and “attachmentX” value(s) for each attachment. (If attachments = 2, there would be an attachment1 value and an attachment2 value.)

I reconfigured SendGrid to post the values to an Azure website I have. The code that I wrote based on the SendGrid documentation (and what I saw on RequestBin) kept throwing exceptions, so I took advantage of Azure’s remote debugging feature (which is awesome, by the way).

The attachment-related values weren’t showing up. SendGrid had posted 13 form values to RequestBin, but I was consistently only getting 9 in my ASP.Net application. For 2 days, I assumed that something was going on with ASP.Net, thinking that it was related to request validation, or something like that.

After hours of fighting with this, I finally realized that when I had re-configured SendGrid to post to my Azure site instead of RequestBin, I had checked the “SendRaw” checkbox. The explanation next to the checkbox says “This will post the full mime message”. What it doesn’t tell you is – it changes what SendGrid posts.

If you check the SendRaw checkbox – you won’t get these form values:

  • attachments
  • attachment-info
  • attachmentX
  • text
  • html

Instead you will get an “email” value that has the original email message.

Since the whole point of using the Inbound Parse is to not have to parse the email, I’d recommend not checking the SendRaw checkbox.
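With SendRaw unchecked, pulling the attachments out in classic ASP.Net looks roughly like the sketch below. The attachment parts arrive as file uploads rather than plain text fields, so they show up in Request.Files; the save path here is made up, and in ASP.Net Core you’d use Request.Form.Files instead:

  // using System.IO; using System.Web;
  // "attachments" holds the count; the files themselves arrive as multipart
  // uploads named attachment1, attachment2, ...
  int count = int.Parse(Request.Form["attachments"] ?? "0");
  for (int i = 1; i <= count; i++)
  {
      HttpPostedFile file = Request.Files["attachment" + i];
      if (file != null)
      {
          // Destination folder is hypothetical - save wherever makes sense.
          file.SaveAs(Path.Combine(@"C:\InboundParse", file.FileName));
      }
  }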

DATALENGTH vs. LEN in T-SQL (or detecting trailing spaces gotchas)

A client reported an issue with a SQL Server Reporting Services (SSRS) report that had recently been upgraded from 2008R2 to 2012.  The report was splitting a particular area into 2 pages (breaking) where it used to print all of that area on the same page, which is how the client still wanted it to print.

After looking at layout issues, I finally realized that the actual issue was that a group definition was evaluating some string values as different when they appeared to be the same.

The report was designed to look like this:

Group Header for value AA

Details for value AA

Group Footer for value AA

Group Header for value AB

Details for value AB

Group Footer for value AB

When the report ran – I got two sections (complete with header and footer) for value “AA”. Some of the details showed up in one section and some in the other. When I ran the query – it looked like all the rows had the same value (“AA”). I suspected trailing spaces or some other non-visible character, so I ran a select distinct … query and only got 1 occurrence of “AA”. I added a “where Field = ‘AA’” clause and got all of the rows. I added a LEN(Field) column and all of them showed up as “2”.

DATALENGTH to the rescue. If you read this – you can see that SQL Server basically ignores trailing spaces when doing a string comparison. If you look at the documentation for the LEN function, it says “… excluding trailing blanks”.

Turns out that LEN returns the number of characters and DATALENGTH returns the number of bytes. Once I added a DATALENGTH(field) column – I could see that some returned 2 and some returned 10.  My query was UNIONing several data sources together and some of the “AA” values were apparently defined in a CHAR(10) field.

I added some RTrim statements to the part of the query that was showing DATALENGTH of 10 and the report now ran as expected.
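If you want to see this for yourself, a query along these lines shows the difference (the table and column names here are made up):

  -- LEN ignores trailing blanks; DATALENGTH counts every byte that is stored.
  SELECT Field,
         LEN(Field)        AS CharCount,   -- 2 for 'AA' with or without padding
         DATALENGTH(Field) AS ByteCount    -- 2 for varchar 'AA', 10 for a CHAR(10) column
  FROM SomeTable;

  -- The fix: trim the CHAR(10) source so its values match the varchar ones.
  SELECT RTRIM(Field) AS Field FROM CharTenSource
  UNION ALL
  SELECT Field FROM VarcharSource;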

I’m not sure if it’s SQL or SSRS, but something has changed regarding this between 2008R2 and 2012, because this report ran for years the old way.

Posted in SQL