Sunday, November 22, 2020

Nice JavaScript Features



 

In this post I will review some nice JavaScript tricks that I recently found while reading the Modern JavaScript course. I have been using JavaScript for several years, but I still found some JavaScript capabilities that I was not aware of.


In general, I'm less interested in advanced JavaScript syntax and internals that are not relevant to well-written code. For example:


console.log('2' + 1) // prints 21
console.log('' || false) //prints false

console.log('' || "yes") //prints yes
console.log('' ?? "yes") //prints empty line


It is nice to understand why these statements behave the way they do, but when writing clean code, one should avoid such confusing behavior.


In the next sections, I will review the JavaScript items that are more relevant in my opinion.



BigInt


JavaScript numbers can safely represent integers only in the range -(2^53 - 1) to 2^53 - 1.

However, JavaScript has built-in support for big integers, for example:


const big1 = 1234567890123456789012345678901234567890n
const big2 = 1234567890123456789012345678901234567890n
console.log(big1 + big2)
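
To see why the safe range matters, here is a minimal sketch comparing regular numbers with BigInt right at the limit. The variable names are only illustrative.

```javascript
// Largest integer a regular number can represent exactly: 2^53 - 1.
const maxSafe = Number.MAX_SAFE_INTEGER

// Beyond the safe range, distinct integers collapse to the same number.
const numbersCollide = maxSafe + 1 === maxSafe + 2

// BigInt arithmetic stays exact.
const exact = BigInt(maxSafe) + 2n

// BigInt and number cannot be mixed in arithmetic; convert explicitly.
let mixError = null
try {
  exact + 1
} catch (err) {
  mixError = err
}

console.log(numbersCollide) // true
console.log(exact) // 9007199254740993n
console.log(mixError instanceof TypeError) // true
console.log(exact + BigInt(1)) // 9007199254740994n
```

The n suffix creates a BigInt literal, and BigInt(value) converts an existing integer number.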



Debugger


Once in a while, when debugging JavaScript code in Google Chrome, I've found the Chrome developer tools getting confused. Even though I've added a breakpoint at a specific location in the source, it does not stop there. This is mostly due to hot reload during a debugging session, where the sources are reloaded as new files.

A nice trick to force a breakpoint is to add the debugger statement in the source itself:

console.log('Hello')
debugger // break here
console.log('World')


Note that execution stops at the debugger statement only if the Google Chrome developer tools window is open.



Simple User Interaction


We all use the alert function to print some debug information, but what about getting input from the user? Turns out that we have built-in functions for that:


const name = window.prompt('Your name?', 'John')
const ok = window.confirm(`Hello ${name}, click OK to continue`)
alert(ok ? 'here we go' : 'quitting')



The prompt function opens a text box with OK and Cancel buttons.

The confirm function opens a message with OK and Cancel buttons.

We will not use this in a production environment, but it is great for debugging purposes.



Nullish Coalescing


When we want to fall back to a default only if a value is missing, we can use the nullish coalescing operator (??). Unlike the OR (||) operator, which skips every falsy value (such as false, 0, and ''), it skips only null and undefined.


console.log(null || undefined || false || 'true') // prints true
console.log(null ?? undefined ?? false ?? 'true') // prints false
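
The difference shows up with default values. Below is a small sketch, assuming a hypothetical readSettings helper where 0 and the empty string are legitimate values that should be kept:

```javascript
// Hypothetical settings reader: 0 and '' are valid values here.
function readSettings(settings) {
  return {
    // || replaces every falsy value, so a valid 0 is lost
    retriesWithOr: settings.retries || 3,
    // ?? replaces only null and undefined, so 0 is kept
    retriesWithNullish: settings.retries ?? 3,
    title: settings.title ?? 'untitled',
  }
}

const explicit = readSettings({ retries: 0, title: '' })
const missing = readSettings({})

console.log(explicit) // { retriesWithOr: 3, retriesWithNullish: 0, title: '' }
console.log(missing) // { retriesWithOr: 3, retriesWithNullish: 3, title: 'untitled' }
```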



Optional Chaining


This is a great solution for nested object properties when you are not sure a property exists. Optional chaining removes the need for if-conditions that check whether each property exists in order to avoid the "cannot read property X of undefined" error.


const user1 = {
  name: {
    first: 'John',
  },
}

const user2 = {}

console.log(user1?.name?.first) // prints John
console.log(user2?.name?.first) // prints undefined
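
Optional chaining also works for array indexes and for method calls, and it combines well with the ?? operator to supply a fallback. A minimal sketch (the describe function and the sample users are made up for illustration):

```javascript
const users = [
  { name: { first: 'John' }, tags: ['admin'], greet() { return 'hi' } },
  {},
]

function describe(user) {
  return {
    // property access, with a fallback when any link is missing
    first: user?.name?.first ?? 'unknown',
    // index access: ?.[index]
    firstTag: user?.tags?.[0],
    // method call: ?.() is skipped when the method does not exist
    greeting: user?.greet?.(),
  }
}

console.log(describe(users[0])) // { first: 'John', firstTag: 'admin', greeting: 'hi' }
console.log(describe(users[1])) // { first: 'unknown', firstTag: undefined, greeting: undefined }
console.log(describe(undefined).first) // unknown
```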



Object Conversion


You can control the return value upon conversion of an object to a string and to a number.


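function User(name, balance = 20) {
  return {
    name,
    balance,
    // used when a string is expected, e.g. String(user) or `${user}`
    toString: function () {
      return `user ${name}`
    },
    // used for arithmetic, and also preferred by the + operator
    valueOf: function () {
      return balance
    },
  }
}

const u1 = User('John')
console.log(u1) // prints the object itself: name, balance, and the two functions
console.log(`hello ${u1}`) // prints hello user John
console.log('hello ' + u1) // prints hello 20, since + prefers valueOf
console.log(100 - u1) // prints 80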
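
Which conversion runs depends on a "hint" that the language passes: 'string', 'number', or 'default'. The sketch below uses Symbol.toPrimitive, a single method that receives the hint directly, to make this visible. The Account function is a hypothetical variation of the example above.

```javascript
const seenHints = []

function Account(name, balance) {
  return {
    name,
    balance,
    // One method handles every conversion and receives the hint.
    [Symbol.toPrimitive](hint) {
      seenHints.push(hint)
      return hint === 'number' ? balance : `account ${name}`
    },
  }
}

const acc = Account('John', 20)

const viaTemplate = `${acc}` // hint: 'string'
const viaMinus = 100 - acc // hint: 'number'
const viaPlus = 'hello ' + acc // hint: 'default'

console.log(viaTemplate) // account John
console.log(viaMinus) // 80
console.log(viaPlus) // hello account John
console.log(seenHints) // [ 'string', 'number', 'default' ]
```

Without Symbol.toPrimitive, plain objects treat the 'default' hint like 'number', which is why valueOf, when defined, wins over toString in a + expression.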



forEach method


The forEach callback receives not only each array element, but also its index and the array itself.


const colors = ['red', 'blue', 'green']
colors.forEach((color, index, array) => {
  console.log(color, index, array)
})



Smart Function Parameters


This is a very nice pattern for a function with many parameters: passing a single object and destructuring it lets us add new parameters later without affecting the existing calling code.


function f({firstName, midName = 'Van', lastName}) {
  console.log(firstName, midName, lastName)
}

f({firstName: 'John', lastName: 'Smith'}) // prints John Van Smith
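
As a sketch of how such a function can grow, the hypothetical fullName below adds a title parameter with a default value, so existing callers keep working unchanged. The = {} default for the whole argument also lets callers omit it entirely.

```javascript
// title was added later; callers that do not pass it are unaffected.
function fullName({ firstName, midName = 'Van', lastName, title = '' } = {}) {
  // drop the parts that are empty or missing
  return [title, firstName, midName, lastName].filter(Boolean).join(' ')
}

const oldCall = fullName({ firstName: 'John', lastName: 'Smith' })
const newCall = fullName({ title: 'Dr.', firstName: 'John', lastName: 'Smith' })
const noArgs = fullName()

console.log(oldCall) // John Van Smith
console.log(newCall) // Dr. John Van Smith
console.log(noArgs) // Van
```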



Sealing a Property


JavaScript enables us to protect an object property. This is done by configuring the property descriptor.


'use strict' // without strict mode, the failed delete and assignment are silently ignored

const obj = {}

Object.defineProperty(obj, "secret", {
  value: "You cannot delete or change me",
  writable: false,
  configurable: false,
  enumerable: false,
})

console.log(obj.secret)
delete obj['secret'] // TypeError: Cannot delete property 'secret'
obj.secret = "new value" // TypeError: Cannot assign to read only property 'secret' of object
Object.defineProperty(obj, 'secret', {configurable: true}) // TypeError: Cannot redefine property
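
When exceptions are not wanted, the same three operations can be attempted through the Reflect API, which reports failure with a boolean instead of throwing. A minimal sketch (the vault object is only illustrative):

```javascript
const vault = {}

Object.defineProperty(vault, 'secret', {
  value: 'locked',
  writable: false,
  configurable: false,
  enumerable: false,
})

// Each Reflect call returns false instead of throwing a TypeError.
const deleted = Reflect.deleteProperty(vault, 'secret')
const assigned = Reflect.set(vault, 'secret', 'new value')
const redefined = Reflect.defineProperty(vault, 'secret', { configurable: true })

console.log(deleted, assigned, redefined) // false false false
console.log(vault.secret) // locked
console.log(Object.keys(vault)) // [] because the property is not enumerable
```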


More protection options, such as Object.preventExtensions, Object.seal, and Object.freeze, apply to an entire object.


Final Note


I wrote this post mostly for myself, but it might be relevant for other senior JavaScript engineers who, like me, were unaware of these nice JavaScript capabilities.



Thursday, November 19, 2020

HTML Content Security Policy



 

In this post we will review the usage of a Content Security Policy (aka CSP), which is one of the methods to prevent XSS attacks. I have already reviewed one of the XSS-related attacks in a previous post: MageCart attack.



What is CSP?


CSP is a method to restrict the resources that a web page may access. Using CSP, we can specify the list of domains from which the page is allowed to load scripts, style sheets (CSS), images, videos, and more.



Why Should We Use CSP?


We want to prevent leakage of private information from our site to an attacker's site. We assume that this could occur if one of the 3rd-party services integrated into our site was attacked, and JavaScript was injected into it to leak our private information. See an example in the post: MageCart attack.



How Do We Use CSP?


To use CSP, we can add an HTML meta tag to the page, or an HTTP header to the response that serves the HTML:

Example of a meta tag:


<meta http-equiv="Content-Security-Policy" content="default-src 'self';">


Example of an HTTP header:


"Content-Security-Policy": "default-src 'self';"


The recommended method is the HTTP header, since the HTML meta tag does not support all of the CSP features (for example, report-uri is ignored in a meta tag).

The CSP header value contains the policy, whose syntax is as follows:


(CONFIGURATION_NAME VALUE_NAME+;)+


For example:


default-src 'self'; style-src 'self' 'unsafe-inline';
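
The syntax above is simple enough to generate from a plain object. Here is a minimal sketch, assuming a hypothetical buildPolicy helper that maps each configuration name to its list of values:

```javascript
// Serialize { name: [values...] } into "name value value; name value;".
function buildPolicy(directives) {
  return Object.entries(directives)
    .map(([name, sources]) => `${name} ${sources.join(' ')};`)
    .join(' ')
}

const policy = buildPolicy({
  'default-src': ["'self'"],
  'style-src': ["'self'", "'unsafe-inline'"],
})

console.log(policy) // default-src 'self'; style-src 'self' 'unsafe-inline';
```

Keeping the policy as data makes it easier to add or remove a domain without editing a long string by hand.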



Which CSP Policy Should We Use?


If we want the most security, the policy should block any access to any external resource, so we should use:


default-src 'self';


But this would cause our entire site to fail. Why?

Because any inline script is blocked. For example, the following JavaScript:


<script>
alert('Hello World!')
</script>


is blocked.

How can we allow our inline scripts to run?

We can use a hash or a nonce to allow specific scripts, but this requires many changes to our site source and to the server side.
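
To give a feeling for the server-side work involved, here is a rough sketch in Node.js of producing both kinds of values. The helper names hashSource and newNonce are made up for this example; only the 'sha256-...' and 'nonce-...' formats come from CSP itself.

```javascript
const crypto = require('crypto')

// A hash source is the base64 SHA-256 digest of the exact inline script text.
function hashSource(inlineScript) {
  const digest = crypto.createHash('sha256').update(inlineScript, 'utf8').digest('base64')
  return `'sha256-${digest}'`
}

// A nonce is a random value that must be generated anew for every response.
function newNonce() {
  return crypto.randomBytes(16).toString('base64')
}

const inlineScript = "alert('Hello World!')"
const nonce = newNonce()
const policy = `script-src 'self' ${hashSource(inlineScript)} 'nonce-${nonce}';`

console.log(policy)
```

The hash must be computed over the exact script content, including whitespace, and a nonce allows only script tags that carry the same value in their nonce attribute. This is why both approaches touch every page that has inline scripts.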

A less costly method is to allow all inline scripts to run. This is OK only if our sole purpose is to prevent data leakage, and we do not intend to prevent malicious actions within the site itself. To do this, we use the 'unsafe-inline' source keyword:


default-src 'self' 'unsafe-inline';


The last step is to whitelist the domains that we do allow our site to access, for example:


default-src 'self' 'unsafe-inline' https://connect.facebook.net;



Using CSP Report



Activating CSP protection is the first step in protecting our site. The next step is to monitor the CSP: which URLs were blocked?
CSP provides a simple method to report the blocked resources. However, we have to implement the server side that receives the reports on our own.

To enable CSP reporting, use the report-uri directive:


default-src 'self' 'unsafe-inline'; report-uri http://myreport.com;

The browser sends a JSON POST request for each blocked resource.
An example of such a request is:


{
"csp-report": {
"blocked-uri": "https://www.ynet.co.il/images/og/logo/www.ynet.co.il.png",
"disposition": "enforce",
"document-uri": "http://localhost:3000/domains",
"effective-directive": "img-src",
"line-number": 37,
"original-policy": "default-src 'self'; style-src 'self' 'unsafe-inline'; script-src 'self' 'unsafe-inline' ;report-uri http://127.0.0.1:2121/report;",
"referrer": "http://localhost:3000/domains",
"script-sample": "",
"source-file": "http://localhost:3000/domains",
"status-code": 200,
"violated-directive": "img-src"
}
}
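
A receiving endpoint for such reports can be quite small. The following is a minimal sketch using only the Node.js http module; the function names and the choice of logged fields are assumptions for illustration, and a real endpoint would also need size limits and persistent storage.

```javascript
const http = require('http')

// Pick the most useful fields out of a report body.
function summarizeReport(jsonText) {
  const report = JSON.parse(jsonText)['csp-report'] ?? {}
  return {
    blocked: report['blocked-uri'],
    directive: report['effective-directive'] ?? report['violated-directive'],
    page: report['document-uri'],
  }
}

// Accept POSTed reports, log a summary, and reply without a body.
function handleReport(req, res) {
  if (req.method !== 'POST') {
    res.statusCode = 405
    res.end()
    return
  }
  let body = ''
  req.on('data', (chunk) => { body += chunk })
  req.on('end', () => {
    try {
      console.log('CSP violation:', summarizeReport(body))
      res.statusCode = 204
    } catch (err) {
      res.statusCode = 400 // the body was not valid JSON
    }
    res.end()
  })
}

const server = http.createServer(handleReport)
// To start receiving reports, for example on port 2121: server.listen(2121)
```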


Final Note


We have reviewed the HTML CSP as a method to block data leakage.

However, an attacker might use a whitelisted domain to send the data to the attacker's own account within this domain. This can be done, for example, using the Google Analytics domain.


Wednesday, November 11, 2020

Redis Pub/Sub using go-redis library

 



In this post we will review the usage of Redis Pub/Sub using Go code that uses the go-redis library.


Our main code initiates a connection to Redis, and then starts two subscribers and two publishers. Since we start the subscribers and the publishers as goroutines, we add a sleep of 5 seconds to avoid immediate termination of the process.


package main

import (
    "context"
    "fmt"
    "github.com/go-redis/redis/v8"
    "time"
)

const channel = "my-channel"

func main() {
    address := "127.0.0.1:5555"
    options := redis.Options{
        Addr:     address,
        Password: "",
        DB:       0,
    }
    client := redis.NewClient(&options)

    go subscriber(1, client)
    go subscriber(2, client)
    go publisher(1, client)
    go publisher(2, client)
    time.Sleep(5 * time.Second)
}



The subscriber loops forever on ReceiveMessage calls, and prints each received message to STDOUT.


func subscriber(subscriberId int, client *redis.Client) {
    ctx := context.Background()
    pubsub := client.Subscribe(ctx, channel)
    for {
        message, err := pubsub.ReceiveMessage(ctx)
        if err != nil {
            panic(err)
        }
        fmt.Printf("subscriber %v got notification: %s\n", subscriberId, message.Payload)
    }
}


And each of the publishers sends 3 messages to the channel.


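func publisher(publisherId int, client *redis.Client) {
    ctx := context.Background()
    for i := 0; i < 3; i++ {
        message := fmt.Sprintf("Hello #%v from publisher %v", i, publisherId)
        // Publish returns a command object; check its error instead of ignoring it
        if err := client.Publish(ctx, channel, message).Err(); err != nil {
            panic(err)
        }
    }
}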


Once we run this small application, we get the following output:


subscriber 1 got notification: Hello #1 from publisher 1
subscriber 2 got notification: Hello #1 from publisher 1
subscriber 2 got notification: Hello #1 from publisher 2
subscriber 2 got notification: Hello #2 from publisher 2
subscriber 1 got notification: Hello #1 from publisher 2
subscriber 1 got notification: Hello #2 from publisher 2
subscriber 2 got notification: Hello #2 from publisher 1
subscriber 1 got notification: Hello #2 from publisher 1


But, wait.

We have 2 publishers, 2 subscribers, and 3 messages. That's 2*2*3 = 12 expected messages to STDOUT, but we got only 8 messages.

The reason is that Redis Pub/Sub does not behave as a queue. Instead, only the subscribers that are active at the time of publishing get notified of messages in the channel. As the subscribers are not active yet when the first messages are sent, these messages are not delivered to any subscriber.

If we wanted all of the messages to be received, we should wait (e.g. sleep) after launching the subscriber goroutines and before starting the publishers.


Final Note


In this post we have reviewed the Redis Pub/Sub usage, and its behavior.

When running Pub/Sub in a Redis cluster, the messages are broadcast to all of the cluster nodes, which might be a performance issue. If this is indeed a performance issue, consider using Redis Streams instead.

Wednesday, November 4, 2020

Using Soundex and Levenshtein-Distance in Python



In this post we will review the usage of the Soundex and the Levenshtein-Distance algorithms to check word similarity. Our goal is to implement a method that decides whether a given word is similar to one of the words in a predefined list of well-known words.

For example, we could have the following list of predefined words:


predefined_words = [
"user",
"account",
"address",
"name",
"firstname",
"lastname",
"surname",
"credit",
"card",
"password",
"pass",
]


Given a new word such as "uzer", we would like to find out whether it matches any of our predefined words.


Soundex

One method is to use the Soundex function. 

The Soundex function creates a string of 4 characters representing the phonetic sound of the word. The following code generates random word pairs, and prints the pairs that are similar:


import random
import string

import jellyfish


def match_soundex(str1, str2):
    # two words match if they have the same Soundex code
    sound_ex1 = jellyfish.soundex(str1)
    sound_ex2 = jellyfish.soundex(str2)
    return sound_ex1 == sound_ex2


def random_word():
    # build a lowercase word of 4 to 8 random letters
    word_len = random.randint(4, 8)
    word = ""
    for _ in range(word_len):
        word += random.choice(string.ascii_letters)
    return word.lower()


for i in range(10000):
    w1 = random_word()
    w2 = random_word()
    if match_soundex(w1, w2):
        print(w1, w2)


and the output is:


ylqhso yloja
wpppw wbuihu
doyk dhyazgg
vvzbzpam vskpakt
gxtjh gxdzu
pgpeg pspqnug
xahbfhs xvex


Levenshtein Distance

Another method is the Levenshtein-Distance.

The Levenshtein-Distance uses dynamic programming to calculate how many single-character changes are required to turn one string into a second string.

The following code generates random word pairs, and prints the pairs that are similar. For our purpose, 2 words are similar if the distance is less than 20% of the length of the shorter word.


import random
import string

import jellyfish


def match_levenshtein(str1, str2):
    # compare the edit distance to the length of the shorter word
    distance = jellyfish.levenshtein_distance(str1, str2)
    min_len = min(len(str1), len(str2))
    return distance / min_len < 0.2


def random_word():
    # build a lowercase word of 4 to 8 random letters
    word_len = random.randint(4, 8)
    word = ""
    for _ in range(word_len):
        word += random.choice(string.ascii_letters)
    return word.lower()


for i in range(10000):
    w1 = random_word()
    w2 = random_word()
    if match_levenshtein(w1, w2):
        print(w1, w2)



and the output is:

wyqg wxeo
khuqw kqosz
wvhy weve
yqspuzc ycpg
rgvo rkwgpo
nhgxbag njqvxk
woebbbkf wvkpfyf


The Lookup Implementation

Now we can use a combination of these two functions to look for similar words. If either Soundex or the Levenshtein-Distance reports a match, we declare that the word is found.


# uses predefined_words, random_word, match_soundex and match_levenshtein from above
for i in range(1000):
    w1 = random_word()
    for predefined in predefined_words:
        if match_soundex(w1, predefined) or match_levenshtein(w1, predefined):
            print(w1, predefined)


Final Note


In this post we have presented a method for finding word similarity.

Per my tests, the Soundex function finds similarities even between words that do not look similar. This is due to the default of using a 4-character string to represent the sound of the word. The jellyfish Python library (that we've used) does not allow changing the default length. For production usage, I recommend using a library that does allow changing this default.