Kettle Getting-Started Tutorial (Detailed Guide to Using the Controls).pdf

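Kettle 3.2 User Manual
Contents
Overview
1. Kettle Repository Management
1.1 Creating a Repository
1.2 Updating a Repository
1.3 Repository Login and User Management
1.4 Differences Between Logging In With and Without a Repository
2. Menu Bar
2.1 File
2.2 Edit
2.3 View
2.4 Repository
2.5 Transformation
2.6 Job
2.7 Wizard
2.8 Help
2.9 Variables
2.9.1 Using Variables
2.9.2 Variable Scope
2.9.2.1 Environment Variables
2.9.2.2 Kettle Variables
2.9.2.3 Internal Variables
3. Toolbars
3.1 Transformation Toolbar
3.2 Jobs Toolbar
4. Main Object Tree
4.1 Transformation Main Object Tree
4.1.1 Creating a Transformation
4.1.2 Transformation Settings
4.1.3 DB Connections
4.1.4 Steps
4.1.5 Hops
4.1.5.1 Right-click the Hops node to create and sort hops
4.1.5.2 Right-click an individual hop to edit or delete its properties
4.1.6 Database Partition Schemas
4.1.7 Slave Servers
4.1.8 Kettle Cluster Schemas
4.2 Jobs Main Object Tree
4.2.1 Creating a Job
4.2.2 Setting Job Properties
4.2.3 DB Connections
4.2.4 Job Entries
4.2.5 Slave Servers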
5. Transformation Core Objects
5.1 Transform
5.2 Input
5.3 Input
5.3.1 Access Input
5.3.2 CSV file input
5.3.3 Cube Input (multidimensional cube)
5.3.4 Excel Input
5.3.5 Fixed file input
5.3.6 Generate random value
5.3.7 Get file Names
5.3.8 Get Files Rows Count
5.3.9 Get data from XML
5.3.10 LDAP Input
5.3.11 LDIF Input
5.3.12 Mondrian Input
5.3.13 Property Input
5.3.14 Streaming XML Input
5.3.15 XBase Input
5.3.16 XML Input
5.3.17 Text File Input
5.3.18 Generate Rows
5.3.19 Get System Info
5.3.20 Table Input
5.4 Output
5.4.1 Access Output
5.4.2 Cube Output
5.4.3 Excel Output
5.4.4 Properties Output
5.4.5 SQL File Output
5.4.6 XML Output
5.4.7 Delete
5.4.8 Insert/Update
5.4.9 Text File Output
5.4.10 Update
5.4.11 Table Output
5.5 Lookup
5.5.1 Check if a column exists
5.5.2 File Exists
5.5.3 HTTP client
5.5.4 Table exists
5.5.5 Web Services Lookup
5.5.6 Database Lookup
5.5.7 Database Join
5.5.8 Stream Lookup
5.5.9 Call DB Procedure
5.6 Transform
5.6.1 Abort
5.6.2 Add XML
5.6.3 Add a checksum
5.6.4 Analytic Query
5.6.5 Append Streams
5.6.6 Blocking Step
5.6.7 Clone row
5.6.8 Closure Generator
5.6.9 Data Validator
5.6.10 Delay row
5.6.11 Identify last row in a stream
5.6.12 Metadata structure of stream
5.6.13 Null if
5.6.14 Row Normaliser
5.6.15 Split field to rows
5.6.16 Switch / case
5.6.17 XSD Validator
5.6.18 XSL Transformation
5.6.19 Value Mapper
5.6.20 Group By
5.6.21 Unique Rows
5.6.22 Add Constants
5.6.23 Add Sequence
5.6.24 Select Values
5.6.25 Split Fields
5.6.26 Sort Rows
5.6.27 Dummy (do nothing)
5.6.28 Row Flattener
5.6.29 Row Denormaliser
5.6.30 Calculator
5.6.31 Filter Rows
5.7 Joins
5.7.1 Merge Join
5.7.2 Sorted Merge
5.7.3 XML Join
5.7.4 Merge Rows
5.7.5 Join Rows (Cartesian product)
5.8 Scripting
5.8.1 Modified Java Script Value
5.8.2 Regex Evaluation
5.8.3 Execute SQL Script
5.9 Data Warehouse
5.9.1 Dimension Lookup/Update
5.9.2 Combination Lookup/Update
5.10 Mapping
5.10.1 Mapping (sub-transformation)
5.10.2 Mapping Input Specification
5.10.3 Mapping Output Specification
5.11 Job
5.11.1 Get Variables
5.11.2 Get files from result
5.11.3 Set Variables
5.11.4 Set files in result
5.11.5 Get Rows from Result
5.11.6 Copy Rows to Result
5.12 Inline
5.12.1 Injector
5.12.2 Socket reader
5.12.3 Socket writer
5.13 Experimental
5.14 Deprecated
5.14.1 Aggregate Rows
5.15 Bulk loading
5.16 History
6. Jobs Core Objects
6.1 General
6.1.1 Dummy Job
6.2 General
6.2.1 START
6.2.2 Dummy Job
6.2.3 Abort Job
6.2.4 Display Message Dialog
6.2.5 Job
6.2.6 Ping a host
6.2.7 Success
6.2.8 Text Output
6.2.9 Write to Log
6.3 Mail
6.3.1 Write to Log
6.3.2 Mail
6.4 File Management
6.4.1 Add Filenames to Result
6.4.2 Compare Folders
6.4.3 Copy Files
6.4.4 Copy or Move Result Filenames
6.4.5 Create Folder
6.4.6 Create File
6.4.7 Delete File
6.4.8 Delete Filenames from Result
6.4.9 Delete File
6.4.10 Delete Folder
6.4.11 File Compare
6.4.12 HTTP
6.4.13 Move Files
6.4.14 Unzip File
6.4.15 Wait for File
6.4.16 Zip File
6.5 Conditions
6.5.1 Check if a Folder Is Empty
6.5.2 Check if a File Exists
6.5.3 Check if a Column Exists in a Database Table
6.5.4 Check That a File Exists
6.5.5 Check if a Table Exists
6.5.6 Wait
6.6 Scripting
6.6.1 Mail
6.6.2 SQL
6.6.3 SHELL
6.7 Bulk Loading
6.7.1 Bulk Load from MySQL into a File
6.7.2 Bulk Load from a File into MS SQL Server
6.7.3 Bulk Load from a File into MySQL
6.8 XML
6.8.1 Check if XML File is well formed
6.8.2 DTD Validator
6.8.3 XSD Validator
6.8.4 XSL Transformation
6.9 File Transfer
6.9.1 FTP
6.9.2 FTP Delete
6.9.3 Put a file with FTP
6.9.4 Put a file with SFTP
6.9.5 SSH2 Get
6.9.6 SSH2 Put
6.9.7 Secure FTP
6.10 Repository
6.10.1 Check if connected to repository
6.10.2 Export repository to XML file
6.11 Experimental
6.11.1 Evaluate rows number in a table
6.11.2 MS Access Bulk Load
6.11.3 Set variables
6.11.4 Simple evaluation
6.11.5 Truncate tables
6.11.6 Wait for SQL
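Overview
The name Kettle refers to a water kettle: Matt, the project's lead programmer, wanted to pour all kinds of data into one kettle and have it flow out in a specified format. Kettle consists of four main parts: Chef, Spoon, Kitchen, and Pan. Kettle provides a graphical user interface, Spoon, for designing data transformation processes. In Spoon, the user works with the component tree on the left, designs the transformation flow in the panel on the right, and views the run results in the Log View panel. This document describes how to use the components available in the Spoon graphical user interface.

1. Kettle Repository Management
At login you can choose "No repository" to enter Kettle directly. In that case the transformations and jobs you define can only be saved on the local disk, as .ktr and .kjb files. If you log in with a repository, all the transformations and jobs you define are stored in the repository. A repository is in fact just a database, for example a SQL Server database, that holds the metadata of the elements defined in Kettle; put simply, it is a metadata database. Once a repository has been created, its information is saved in the file "repositories.xml", which is located in the hidden ".kettle" directory under your default home directory. On Windows, this path is c:\Documents and Settings\\.kettle.

1.1 Creating a Repository
1) Create the database connection for the repository
Click the "New" button; the following dialog appears: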
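a. The database connection field lets us choose a database connection. This means we can create one or more databases on a local database server (for example, SQL Server) to serve as repositories, and then connect to one of them.
b. This field gives the name of the repository.
Since we have no repository at first, click the New button to create a connection to a database that will serve as the repository. Note that the database must already exist (SQL Server is used as the example here). To set up the database connection:
(1) We take a database named KettleZyk, created in SQL Server, as the example, and connect to SQL Server through ODBC. First configure an ODBC data source for the KettleZyk database:
(2) Click the New button and make the following settings. Clicking the Edit button lets you edit this connection; Delete removes the connection.
2) Create the repository
Select the database connection you just created, fill in the repository name, and click the button to create the new repository. The following dialog appears: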
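
As a side note to the storage rules described in Section 1, the following is a minimal Python sketch, not part of Kettle itself, that checks whether repositories.xml is present in the hidden .kettle directory of the current user's home directory and tells local .ktr files from .kjb files by extension. The function names and the sample file names (load_sales.ktr, nightly.kjb, notes.txt) are hypothetical and exist only for illustration; the mapping of .ktr to transformations and .kjb to jobs is assumed from the order in which Section 1 lists them.

```python
from pathlib import Path
from typing import Optional

# Extensions named in Section 1 for objects saved without a repository.
# Assumption: .ktr holds a transformation and .kjb holds a job.
KIND_BY_EXTENSION = {".ktr": "transformation", ".kjb": "job"}


def repositories_file(home: Optional[Path] = None) -> Path:
    """Return the expected location of repositories.xml under .kettle."""
    base = home if home is not None else Path.home()
    return base / ".kettle" / "repositories.xml"


def classify(filename: str) -> Optional[str]:
    """Map a local file name to the kind of Kettle object it holds, or None."""
    return KIND_BY_EXTENSION.get(Path(filename).suffix.lower())


if __name__ == "__main__":
    path = repositories_file()
    state = "found" if path.is_file() else "not found"
    print(f"{path}: {state}")
    # Hypothetical file names, used only to show the classification.
    for name in ("load_sales.ktr", "nightly.kjb", "notes.txt"):
        print(name, "->", classify(name))
```

If the script reports "not found", that is consistent with a setup where no repository has been created yet and everything is stored as local files.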