
Sudhesh / gitee-word-count

Snakefile 1.41 KB
S Kumar committed on 2020-11-05 13:49 . initial commit
# a list of all the books we are analyzing
DATA = glob_wildcards('data/{book}.txt').book
# this is for running on HPC resources
localrules: all, make_archive
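# Illustrative only (an assumption, not taken from this repository): rules listed in
# localrules run on the submitting node, while the remaining rules can be sent to a
# scheduler. With Snakemake releases from around the time of this commit, a SLURM
# submission might look something like:
#   snakemake --jobs 10 --cluster "sbatch --cpus-per-task={threads}"
# Newer Snakemake releases replace --cluster with executor plugins, so check the
# documentation for the installed version before relying on this form.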
# the default rule
rule all:
    input:
        'zipf_analysis.tar.gz'
# count words in one of our books
# the output of each run is written to a matching .log file
rule count_words:
    input:
        wc='source/wordcount.py',
        book='data/{file}.txt'
    output: 'processed_data/{file}.dat'
    threads: 4
    log: 'processed_data/{file}.log'
    shell:
        '''
        echo "Running {input.wc} with {threads} cores on {input.book}." &> {log} &&
        python {input.wc} {input.book} {output} >> {log} 2>&1
        '''
# create a plot for each book
rule make_plot:
    input:
        plotcount='source/plotcount.py',
        book='processed_data/{file}.dat'
    output: 'results/{file}.png'
    shell: 'python {input.plotcount} {input.book} {output}'
# generate summary table
rule zipf_test:
    input:
        zipf='source/zipf_test.py',
        books=expand('processed_data/{book}.dat', book=DATA)
    output: 'results/results.txt'
    shell: 'python {input.zipf} {input.books} > {output}'
# create an archive with all of our results
rule make_archive:
    input:
        expand('results/{book}.png', book=DATA),
        expand('processed_data/{book}.dat', book=DATA),
        'results/results.txt'
    output: 'zipf_analysis.tar.gz'
    shell: 'tar -czvf {output} {input}'
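# Usage sketch (an assumption, not part of the original file): with Snakemake
# installed and this Snakefile in the working directory, a dry run lists the jobs
# that would execute without running them:
#   snakemake -n
# The full workflow, ending with the default target zipf_analysis.tar.gz, can then
# be run on up to 4 local cores (matching the threads value of count_words) with:
#   snakemake --cores 4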