0%

Order Frequency Analysis

  1. 分析的核心是 Customer order frequency
  2. 从单个Customer角度来看:
    • 问题1:一定时间范围内,下单次数和下单金额,哪个更重要?
    • 问题2:如何体现下单集中度,例如上半年下单6次,下半年没有下单,这样全年看起来,平均两个月下一次单,而实际上和事实严重不符。
    • 问题3:如何更准确地表现客户 当前年 / 连续多年 的下单规律。
  3. 从一类Customer角度来看:
    • 问题1:对于总销售额相同/相近的客户,如何从Order Frequency的角度来区分它们。是下单频率高的更有价值,还是平均单价高的更有价值。或者说,需要从不同的角度来看,对客户做出一个标签。例如,一年内总销售额相同的两个客户,下单次数少的必然销售单价高,下单次数多的必然销售单价低,从我们的角度来说,这其实是两类客户,不能因为一年内总销售额相同而划为同一类客户。
    • 问题2:不同的维度下的客户群体是否有下单周期共性,下单月份共性,是随机分布,还是会集中于某些季节,集中于某些月份。如果是集中的,那么集中的季节或者月份,是不是可解释的。
    • 问题3:紧跟问题2,如果看起来下单周期是随机的,那么如果我们下沉到Product Category维度,是否依然是随机的?Brand维度呢?Material维度呢?
  4. 客户分为新客户老客户,从新老客户角度来看:
    • 问题1: 如何定义新老客户。
    • 问题22号问题3号问题用来分析老客户是没问题,但是对于新客户,如果定义新客户为一年内Created的客户,那么以上分析一定程度上是无法得出正确结论的。那么就需要在分析的时候,对于新老客户进行区分。
  5. 从预测角度来看:
    • 问题1: 经过以上分析,是否可以一定程度上对客户下单形成一定的规律性结果,从而根据规律进行每个客户的下单预测。此预测并不作为实际Business中应用,但是从数据分析角度来说,可以作一定程度上的Simulation。
    • 问题2: 是否可以根据预测,定期提醒客户下单。

  1. 安装好 Mysql 在 Mac 中。
  2. 配置 PATH : PATH="$PATH":/usr/local/mysql/bin
  3. 登录 Mysql: mysql -u root -p
  4. 关闭 Mysql 限制:set @@global.sql_mode='';
  5. 然后登录 Navicat 连接数据库,导入 CSV。

我最近开始遇到MySQL的的非法混合归类错误。偶尔的查询会导致

Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation '='

解决方法如下:

SET collation_connection = 'utf8_general_ci';
ALTER DATABASE ECDatas CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE Count_DishesAll_Bapa15_100 CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE Count_DishesAll_Cd CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE Count_DishesAll_Fd CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE Count_DishesAll_Hamburger CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE Count_DishesAll_Lao CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE Count_DishesAll_Milk CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE Count_DishesAll_Naigai CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE Count_DishesAll_Naigai CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE Count_DishesAll_Pizza CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE Areas CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE Cities CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE DivisionsOfChina CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE Streets CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE Villages CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE DishesAll_Bapa15_100 CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE DishesAll_Cd CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE DishesAll_Fd CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE DishesAll_Milktea CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE DishesAll_Qsr_20_60 CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE CustomerData CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE ProductData CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE SSD2018 CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE SSD20162017 CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE CtripHotels CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE PoiData CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE PoiDataTotal CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE CustomerAddress CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE SSDTotal CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;

一、MySQL删除单列重复数据保留Id最小的

需求:如表所示,其中GOODS_ID有数据重复的现象,现在要删除MER_GROUP_ID为90的重复数据

GOODS_PRICE_ID GOODS_ID MER_GROUP_ID
1 aa 90
2 aa 90
3 bb 90
4 bb 90
5 cc 99
6 cc 98
7 dd 71
8 dd 19

1. 里面嵌套的sql语句是找出重复列GOODS_ID中GOODS_PRICE_ID最大的数据,用SELECT语句将重复字段查找出来。

SELECT max(GOODS_PRICE_ID) as GOODS_PRICE_ID 
       FROM GOODS_PRICE 
       WHERE MER_GROUP_ID = '90' 
       GROUP BY GOODS_ID 
       HAVING count(GOODS_ID) > 1 
阅读全文 »

开始使用

我下面所有的SQL语句是基于Mysql5.6+运行。

Mysql为关系型数据库(Relational Database Management System),一个关系型数据库由一个或数个表格组成:

  • 表头(header): 每一列的名称;
  • 列(col): 具有相同数据类型的数据的集合;
  • 行(row): 每一行用来描述某个人/物的具体信息;
  • 值(value): 行的具体信息, 每个值必须与该列的数据类型相同;
  • 键(key): 表中用来识别某个特定的人\物的方法, 键的值在当前列中具有唯一性。

登录MySQL

阅读全文 »

修改城市名为统一名称

//临时使用语句,和主题无关。
CREATE TABLE Count_dishesAll_Pizza (
	SELECT
		City,
		COUNT( * ) AS CountCity 
	FROM
		SSD2018 
	GROUP BY
		City 
	ORDER BY
		CountCity DESC 
	);

//修改城市名为统一名称

UPDATE SSDTotal 
SET City = REPLACE ( city, 'Shanghai', '上海' );
UPDATE SSDTotal 
SET City = REPLACE ( city, 'shanghai', '上海' );
UPDATE SSDTotal 
SET City = REPLACE ( city, 'Beijing', '北京' );
UPDATE SSDTotal 
SET City = REPLACE ( city, 'beijing', '北京' );
UPDATE SSDTotal 
SET City = REPLACE ( city, 'Ningbo', '宁波' );
UPDATE SSDTotal 
SET City = REPLACE ( city, 'chongqing', '重庆' );
UPDATE SSDTotal 
SET City = REPLACE ( city, 'Huzhou', '湖州' );
UPDATE SSDTotal 
SET City = REPLACE ( city, 'Hohhot', '呼和浩特' );
UPDATE SSDTotal 
SET City = REPLACE ( city, ' Nanjing', '南京' );
UPDATE SSDTotal 
SET City = REPLACE ( city, 'guangzhou', '广州' );
UPDATE SSDTotal 
SET City = REPLACE ( city, 'Qingdao', ' 青岛' );
UPDATE SSDTotal 
SET City = REPLACE ( city, 'Shenzhen', '深圳' );
UPDATE SSDTotal 
SET City = REPLACE ( city, 'Chengdu', '成都' );
UPDATE SSDTotal 
SET City = REPLACE ( city, 'chengdu', '成都' );
UPDATE SSDTotal 
SET City = REPLACE ( city, 'Shenyang', '沈阳' );
UPDATE SSDTotal 
SET City = REPLACE ( city, 'Guangzhou', '广州' );
UPDATE SSDTotal 
SET City = REPLACE ( city, 'Wuhu', '芜湖' );
UPDATE SSDTotal 
SET City = REPLACE ( city, 'Hangzhou', '杭州' );
UPDATE SSDTotal 
SET City = REPLACE ( city, 'Zhongshan', '中山' );
UPDATE SSDTotal 
SET City = REPLACE ( city, 'Taiyuan', '太原' );
UPDATE SSDTotal 
SET City = REPLACE ( city, 'Shantou', '汕头' );
UPDATE SSDTotal 
SET City = REPLACE ( city, 'Wuhan', '武汉' );
UPDATE SSDTotal 
SET City = REPLACE ( city, 'Jinan', '济南' );
UPDATE SSDTotal 
SET City = REPLACE ( city, 'Dongguan', '东莞' );
UPDATE SSDTotal 
SET City = REPLACE ( city, 'Nanning', '南宁' );
UPDATE SSDTotal 
SET City = REPLACE ( city, 'wenzhou', '温州' );
UPDATE SSDTotal 
SET City = REPLACE ( city, 'changsha', '长沙' );
UPDATE SSDTotal 
SET City = REPLACE ( city, 'zhengzhou', '郑州' );
UPDATE SSDTotal 
SET City = REPLACE ( city, 'Zhuhai', '珠海' );
UPDATE SSDTotal 
SET City = REPLACE ( city, 'Fuzhou', '福州' );
UPDATE SSDTotal 
SET City = REPLACE ( city, 'Suzhou2', '苏州' );
UPDATE SSDTotal 
SET City = REPLACE ( city, 'Wuxi', '无锡' );
UPDATE SSDTotal 
SET City = REPLACE ( city, 'Tianjin', '天津' );
UPDATE SSDTotal 
SET City = REPLACE ( city, 'Suzhou', '苏州' );
UPDATE SSDTotal 
SET City = REPLACE ( city, 'nanning', '南宁' );
UPDATE SSDTotal 
SET City = REPLACE ( city, '广东省广州', '广州' );
UPDATE SSDTotal 
SET City = REPLACE ( city, 'Nanjing', '南京' );
UPDATE SSDTotal 
SET City = REPLACE ( city, 'Guazhou', '瓜州' );
UPDATE SSDTotal 
SET City = REPLACE ( city, '四川郫县', '郫县' );
UPDATE SSDTotal 
SET City = REPLACE ( city, '通州区', '通州' );
UPDATE SSDTotal 
SET City = REPLACE ( city, '青岛胶州', '胶州' );
UPDATE SSDTotal 
SET City = REPLACE ( city, '市', '' );

# 修改 Mysql 运行模式
```sql
//修改 SQL 运行模式
SET sql_mode=(SELECT REPLACE(@@sql_mode,'ONLY_FULL_GROUP_BY',''));

数据分析

/* 查看表内 Billing Data 列表。*/ 
SELECT
`Billing Date`,
COUNT( `Billing Date` ) 
FROM
	SSD2018 
GROUP BY
	`Billing Date` 

/*查询某月客户聚合销量。*/
SELECT
	SaleOffice,
	CustomerName,
	round( sum( `Sale NET Value` ), 2 ) AS '2018-08' 
FROM
	SSD2018 
WHERE
	MONTH ( `Billing Date` ) = MONTH ( '2018-08-01' ) 
	AND YEAR ( `Billing Date` ) = YEAR ( curdate( ) ) 
GROUP BY
	CustomerName,
	SaleOffice 

/*查询某月*/
SELECT
	* 
FROM
	SSD2018 
WHERE
	MONTH ( `Billing Date` ) = MONTH ( '2018-08-01' ) 
	AND YEAR ( `Billing Date` ) = YEAR ( curdate( ) )

删除数据

阅读全文 »

All_New_Bapa

SELECT
	`poidata`.`id` AS `id`,
	`poidata`.`shop_id` AS `shop_id`,
	`poidata`.`name` AS `name`,
	`poidata`.`real_city` AS `real_city`,
	`poidata`.`province` AS `province`,
	`poidata`.`phone` AS `phone`,
	`poidata`.`address` AS `address`,
	`poidata`.`avg_price` AS `avg_price`,
	`poidata`.`small_cate` AS `small_cate`,
	`poidata`.`dishes` AS `dishes` 
FROM
	`poidata` 
WHERE
	(
		`poidata`.`small_cate` IN (
			'奶茶/果汁',
			'面包/饮品',
			'面包甜点',
			'甜品饮品',
			'甜品',
			'饮料冰品',
			'冰淇淋',
			'蛋挞',
			'冰品',
			'新式甜品',
			'雪糕饮品',
			'甜品饮料',
			'饮品',
			'西式甜点',
			'港台甜品',
			'传统甜品' 
		) 
	) 

All_New_Casual_dining

SELECT
	`poidata`.`id` AS `id`,
	`poidata`.`shop_id` AS `shop_id`,
	`poidata`.`name` AS `name`,
	`poidata`.`real_city` AS `real_city`,
	`poidata`.`province` AS `province`,
	`poidata`.`phone` AS `phone`,
	`poidata`.`address` AS `address`,
	`poidata`.`avg_price` AS `avg_price`,
	`poidata`.`small_cate` AS `small_cate`,
	`poidata`.`dishes` AS `dishes` 
FROM
	`poidata` 
WHERE
	(
	`poidata`.`small_cate` IN ( '西餐', '法国菜', '西式正餐', '牛排', '创意菜', '西班牙菜', '印度菜', '意大利菜', '日式烧烤/烤肉', '日本菜', '拉美烧烤', '俄罗斯菜', '日式自助' ) 
	)

All_New_Coffee

阅读全文 »

City字段

  1. “上海” - “上海”
  2. “Zhuhai” - “珠海”
  3. Shanghai”- “shanghai”
  4. “010-北京”- “北京”

DistributeChannel

  1. Retail
    订单Customer均为已冻结Retail客户
    Quantity均为0/负值

  2. Internal

    订单Customer均为BJ001231
    邦士(天津)食品有限公司BSI
    销售数据正常
    Material主要为ELV & IDF的Cream & Cheese

  3. Food Service

    • 当BusinessUnit字段为空时,订单Customer包含YP(领样) & LOG & 下图类型客户(下图仅为一部分客户)

    • 当BusinessUnit字段为空时,Channel也为空。

      其中YP客户Net Value为 0。
      image-20210812101631598

    • 当BusinessUnit字段为Retail时,订单Customer主要为下图类型客户(下图仅为一部分客户)以及YG(员工)。

      销售数据正常,
      image-20210812105708373

    • 当BusinessUnit字段为 Food Service 时,订单Customer包含YG&正常Customer。

    Plant

    1. SINODIS

      BeiJing
      Chengdu
      GuangZhou
      ShangHai
      ShangHai External
      ShenZhen

    2. BSI

      Tianjin : B2B & QSQ Customer Category。

      对应SaleOffice:
      BeiJing
      BSI-Domestic
      ShangHai

      1. Plant 和 Sale Office 的对应关系
      SaleOffice BeiJing Chengdu GuangZhou ShangHai ShangHai External ShenZhen Tianjin
      BeiJing
      BSI-Domestic
      ChengDu
      DaLian
      GuangZhou
      ShangHai
      ShenZhen

Channel

阅读全文 »

首先通过 Navicat Mysql 在目标数据库建立新的表。

新表和要导入的数据应该有一样的数据结构。主要是字段以及字段的类型。

远程登录 Mysql

打开终端,输入如下命令:

mysql --local-infile -h (IP 地址或者 Mysql 域名地址) -P (Mysql 端口) -u (用户名) -p
阅读全文 »

终端命令:

打开 “终端”,根据自己的版本选择命令

早期的 OS X(10.6~10.8)系统可以使用如下两条命令来开始或者关闭系统隐藏文件的显示:
defaults write com.apple.Finder AppleShowAllFiles Yes && killall Finder // 显示隐藏文件 defaults write com.apple.Finder AppleShowAllFiles No && killall Finder // 不显示隐藏文件

当升级到 OS X 10.9 Mavericks 版本之后,这两条命令需要做一些修改,变成了如下命令:
defaults write com.apple.finder AppleShowAllFiles Yes && killall Finder // 显示隐藏文件 defaults write com.apple.finder AppleShowAllFiles No && killall Finder // 不显示隐藏文件

快捷键:

阅读全文 »