To load from STDIN, omit the file name/directory name argument. (When loading from a data file, the last argument to ncluster_loader is the data file name.) For example, to import our UK sales data and replace the product name “NappyPlus” with “DiaperPlus”, we could use sed and ncluster_loader together like so:
$ sed s/NappyPlus/DiaperPlus/ 2010_UK_sales.csv | ./ncluster_loader -U mjones -w st4g0l33 sales_fact -c -l 10.51.50.240
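Before piping into the loader, it can help to preview what the substitution does on a few rows. The sketch below is illustrative only: the function name rename_product and the two CSV rows are hypothetical, not taken from the real 2010_UK_sales.csv. It also adds the g flag, which the command above does not use; without g, sed replaces only the first match on each line.

```shell
# Hypothetical helper: rewrite the product name on every line of STDIN.
# The trailing 'g' replaces every occurrence on a line, not just the first.
rename_product() {
  sed 's/NappyPlus/DiaperPlus/g'
}

# Made-up sample rows, used only to preview the substitution.
printf '%s\n' '1001,NappyPlus,12.50' '1002,WetWipes,3.20' | rename_product
```

Once the preview looks right, the same sed expression can be placed in front of ncluster_loader as shown in the command above.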
Loading from STDIN lets you pipe your data through useful tools like sed, the text stream editor. Here’s an example of sed at work: it doubles every backslash in the input data so that Aster Loader loads each one as a literal character instead of treating it as the start of an escape sequence.
Here’s some sample data:
# cat sampleData-3.tsv
5 How often do back-slash characters ('\') appear in your data?
6 And how often do you think they actually disappear: 1 \ ? 2 \ ? 3 \ ?
7 \W\a\y \t\o\o \o\f\t\e\n\! \! \!
Here’s how we run Aster Loader, piping its input data through sed (this example names /dev/stdin explicitly as the input file):
$ cat sampleData-3.tsv \
| sed -e 's_\\_\\\\_g' \
| ncluster_loader -h $QUEEN_IP -d my_db -U beehive -w beehive testo /dev/stdin
Loading tuples using node '192.168.28.100'.
3 tuples were successfully loaded into table 'testo'.
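Here are the result rows. The row with id 1 was already in the table before this load; the three rows with ids 5, 6, and 7 are the ones just loaded: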
$ act -h $SYSMAN_IP -d my_db -U beehive -w beehive -c 'SELECT * FROM testo ORDER BY id;'
 id | string
----+---------------------------------------------------------------------
  1 | This is just a line.
  5 | How often do back-slash characters ('\') appear in your data?
  6 | And how often do you think they actually disappear: 1 \? 2 \? 3 \?
  7 | \W\a\y \t\o\o \o\f\t\e\n\! \! \!
(4 rows)
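The sed expression in the loader pipeline can be tried on its own, without a cluster. In 's_\\_\\\\_g' the underscore is the delimiter (chosen so the pattern does not have to compete with slashes), \\ matches one literal backslash, \\\\ writes two, and g applies the change to every backslash on the line. The sketch below wraps the same expression in a function; the name escape_backslashes is hypothetical.

```shell
# Hypothetical helper: double every backslash read from STDIN.
# Same sed expression as in the loader pipeline above.
escape_backslashes() {
  sed -e 's_\\_\\\\_g'
}

# printf with %s passes the argument through unchanged, so the
# backslashes reach sed exactly as typed inside the single quotes.
printf '%s\n' '1 \ ? 2 \ ? 3 \ ?' | escape_backslashes
```

Each single backslash in the input comes out as a pair, which is the form the loader pipeline above feeds to ncluster_loader.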